{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,18]],"date-time":"2026-06-18T09:07:58Z","timestamp":1781773678774,"version":"3.54.5"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2022,6,21]],"date-time":"2022-06-21T00:00:00Z","timestamp":1655769600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000060","name":"National Institute of Allergy and Infectious Diseases","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100012737","name":"Department of Health and Human Services","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100012737","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001744","name":"International Science and Technology Center","doi-asserted-by":"publisher","award":["G-2102"],"award-info":[{"award-number":["G-2102"]}],"id":[{"id":"10.13039\/501100001744","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,7,18]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The evolution of drug-resistant pathogenic microbial species is a major global health concern. Naturally occurring, antimicrobial peptides (AMPs) are considered promising candidates to address antibiotic resistance problems. A variety of computational methods have been developed to accurately predict AMPs. 
The majority of such methods are not microbial strain specific (MSS): they can predict whether a given peptide is active against some microbe, but cannot accurately calculate whether such peptide would be active against a particular MS. Due to insufficient data on most MS, only a few MSS predictive models have been developed so far. To overcome this problem, we developed a novel approach that allows to improve MSS predictive models (MSSPM), based on properties, computed for AMP sequences and characteristics of genomes, computed for target MS. New models can perform predictions of AMPs for MS that do not have data on peptides tested on them. We tested various types of feature engineering as well as different machine learning (ML) algorithms to compare the predictive abilities of resulting models. Among the ML algorithms, Random Forest and AdaBoost performed best. By using genome characteristics as additional features, the performance for all models increased relative to models relying on AMP sequence-based properties only. 
Our novel MSS AMP predictor is freely accessible as part of DBAASP database resource at http:\/\/dbaasp.org\/prediction\/genome<\/jats:p>","DOI":"10.1093\/bib\/bbac233","type":"journal-article","created":{"date-parts":[[2022,6,20]],"date-time":"2022-06-20T18:05:53Z","timestamp":1655748353000},"source":"Crossref","is-referenced-by-count":37,"title":["Comparative analysis of machine learning algorithms on the microbial strain-specific AMP prediction"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5770-9481","authenticated-orcid":false,"given":"Boris","family":"Vishnepolsky","sequence":"first","affiliation":[{"name":"Ivane Beritashvili Center of Experimental Biomedicine , Tbilisi 0160, Georgia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Maya","family":"Grigolava","sequence":"additional","affiliation":[{"name":"Ivane Beritashvili Center of Experimental Biomedicine , Tbilisi 0160, Georgia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Grigol","family":"Managadze","sequence":"additional","affiliation":[{"name":"Ivane Beritashvili Center of Experimental Biomedicine , Tbilisi 0160, Georgia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Andrei","family":"Gabrielian","sequence":"additional","affiliation":[{"name":"Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health , Bethesda, MD 20892, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Alex","family":"Rosenthal","sequence":"additional","affiliation":[{"name":"Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health , Bethesda, MD 20892, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Darrell E","family":"Hurt","sequence":"additional","affiliation":[{"name":"Office of Cyber Infrastructure and Computational Biology, 
National Institute of Allergy and Infectious Diseases, National Institutes of Health , Bethesda, MD 20892, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Michael","family":"Tartakovsky","sequence":"additional","affiliation":[{"name":"Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health , Bethesda, MD 20892, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Malak","family":"Pirtskhalava","sequence":"additional","affiliation":[{"name":"Ivane Beritashvili Center of Experimental Biomedicine , Tbilisi 0160, Georgia"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2022,6,21]]},"reference":[{"key":"2022071906095548500_ref1","doi-asserted-by":"crossref","first-page":"bbab083","DOI":"10.1093\/bib\/bbab083","article-title":"Comprehensive assessment of machine learning-based methods for predicting antimicrobial peptides","volume":"22","author":"Xu","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022071906095548500_ref2","doi-asserted-by":"crossref","first-page":"3141","DOI":"10.1021\/acs.jcim.1c00251","article-title":"Alignment-free antimicrobial peptide predictors: improving performance by a thorough analysis of the largest available data set","volume":"61","author":"Pinacho-Castellanos","year":"2021","journal-title":"J Chem Inf Model"},{"key":"2022071906095548500_ref3","doi-asserted-by":"crossref","first-page":"D1094","DOI":"10.1093\/nar\/gkv1051","article-title":"CAMPR3: a database on sequences, structures and signatures of antimicrobial peptides","volume":"44","author":"Waghu","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2022071906095548500_ref4","doi-asserted-by":"crossref","first-page":"4691","DOI":"10.1021\/acs.jcim.0c00841","article-title":"IAMPE: NMR assisted computational prediction of antimicrobial 
peptides","volume":"60","author":"Kavousi","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2022071906095548500_ref5","doi-asserted-by":"crossref","DOI":"10.1016\/j.compbiomed.2021.104778","article-title":"iAtbP-Hyb-EnC: prediction of antitubercular peptides via heterogeneous feature representation and genetic algorithm based ensemble learning model","volume":"137","author":"Akbar","year":"2021","journal-title":"Comput Biol Med"},{"key":"2022071906095548500_ref6","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab358","article-title":"PreTP-EL: prediction of therapeutic peptides based on ensemble learning","volume":"22","author":"Guo","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022071906095548500_ref7","doi-asserted-by":"crossref","first-page":"13588","DOI":"10.1073\/pnas.1609893113","article-title":"Mapping membrane activity in undiscovered peptide sequence space using machine learning","volume":"113","author":"Lee","year":"2016","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022071906095548500_ref8","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1186\/s40779-021-00343-2","article-title":"Antimicrobial peptides: mechanism of action, activity and clinical potential","volume":"8","author":"Zhang","year":"2021","journal-title":"Military Med Res"},{"key":"2022071906095548500_ref9","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1038\/s42003-021-02137-7","article-title":"The lexicon of antimicrobial peptides: a complete set of arginine and tryptophan sequences","volume":"4","author":"Clark","year":"2021","journal-title":"Commun Biol"},{"key":"2022071906095548500_ref10","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1021\/acs.jcim.5b00630","article-title":"First multitarget chemo-Bioinformatic model to enable the discovery of antibacterial peptides against multiple gram-positive pathogens","volume":"56","author":"Speck-Planche","year":"2016","journal-title":"J Chem Inf 
Model"},{"key":"2022071906095548500_ref11","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1021\/acscombsci.6b00063","article-title":"Enabling the discovery and virtual screening of potent and safe antimicrobial peptides. Simultaneous prediction of antibacterial activity and cytotoxicity","volume":"18","author":"Kleandrova","year":"2016","journal-title":"ACS Comb Sci"},{"key":"2022071906095548500_ref12","article-title":"Toward insights on determining factors for high activity in antimicrobial peptides via machine learning","volume":"7","author":"Nantasenamat","year":"2019","journal-title":"PeerJ"},{"key":"2022071906095548500_ref13","article-title":"AMP0: species-specific prediction of anti-microbial peptides using zero and few shot learning","volume":"19","author":"Gull","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2022071906095548500_ref14","doi-asserted-by":"crossref","first-page":"954","DOI":"10.3389\/fphar.2018.00954","article-title":"Prediction of antitubercular peptides from sequence information using ensemble classifier and hybrid features","volume":"9","author":"Usmani","year":"2018","journal-title":"Front Pharmacol"},{"key":"2022071906095548500_ref15","doi-asserted-by":"crossref","first-page":"972","DOI":"10.1016\/j.csbj.2019.06.024","article-title":"AtbPpred: a robust sequence-based prediction of anti-tubercular peptides using extremely randomized trees","volume":"17","author":"Manavalan","year":"2019","journal-title":"Comput Struct Biotechnol J"},{"key":"2022071906095548500_ref16","volume-title":"Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics","author":"Losin","year":"2021"},{"key":"2022071906095548500_ref17","doi-asserted-by":"crossref","first-page":"471","DOI":"10.3390\/biom11030471","article-title":"Deep learning for novel antimicrobial peptide 
design","volume":"11","author":"Wang","year":"2021","journal-title":"Biomolecules"},{"key":"2022071906095548500_ref18","doi-asserted-by":"crossref","first-page":"1141","DOI":"10.1021\/acs.jcim.8b00118","article-title":"Predictive model of linear antimicrobial peptides active against gram-negative bacteria","volume":"58","author":"Vishnepolsky","year":"2018","journal-title":"J Chem Inf Model"},{"key":"2022071906095548500_ref19","volume-title":"Proceedings of the 4th International Electronic Conference on Medicinal Chemistry","author":"Vishnepolsky"},{"key":"2022071906095548500_ref20","doi-asserted-by":"crossref","first-page":"697","DOI":"10.1128\/jb.173.2.697-703.1991","article-title":"16S ribosomal DNA amplification for phylogenetic study","volume":"173","author":"Weisburg","year":"1991","journal-title":"J Bacteriol"},{"key":"2022071906095548500_ref21","doi-asserted-by":"crossref","first-page":"1301","DOI":"10.1099\/ijs.0.03005-0","article-title":"Taxonomy of Australian clinical isolates of the genus Photorhabdus and proposal of Photorhabdus asymbiotica subsp. asymbiotica subsp. nov. and P. asymbiotica subsp. australis subsp. 
nov","volume":"54","author":"Akhurst","year":"2004","journal-title":"Int J Syst Evol Microbiol"},{"key":"2022071906095548500_ref22","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1099\/ijs.0.054171-0","article-title":"Integrating genomics into the taxonomy and systematics of the bacteria and archaea","volume":"64","author":"Chun","year":"2014","journal-title":"Int J Syst Evol Microbiol"},{"key":"2022071906095548500_ref23","doi-asserted-by":"crossref","first-page":"2567","DOI":"10.1073\/pnas.0409727102","article-title":"Genomic insights that advance the species definition for prokaryotes","volume":"102","author":"Konstantinidis","year":"2005","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022071906095548500_ref24","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1186\/1471-2105-14-60","article-title":"Genome sequence-based species delimitation with confidence intervals and improved distance functions","volume":"14","author":"Meier-Kolthoff","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2022071906095548500_ref25","doi-asserted-by":"crossref","first-page":"D288","DOI":"10.1093\/nar\/gkaa991","article-title":"DBAASP v3: database of antimicrobial\/cytotoxic activity and structure of peptides as a resource for development of new therapeutics","volume":"49","author":"Pirtskhalava","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022071906095548500_ref26","doi-asserted-by":"crossref","first-page":"D36","DOI":"10.1093\/nar\/gks1195","volume":"41","author":"Benson","year":"2013","journal-title":"Nucleic Acids Research"},{"key":"2022071906095548500_ref27"},{"key":"2022071906095548500_ref28","doi-asserted-by":"crossref","first-page":"539","DOI":"10.3389\/fmicb.2019.00539","article-title":"Emerging strategies to combat ESKAPE pathogens in the era of antimicrobial resistance: a review","volume":"10","author":"Mulani","year":"2019","journal-title":"Front 
Microbiol"},{"key":"2022071906095548500_ref29","doi-asserted-by":"crossref","first-page":"4791","DOI":"10.3390\/molecules17054791","article-title":"Comparison of different approaches to define the applicability domain of QSAR models","volume":"17","author":"Sahigara","year":"2012","journal-title":"Molecules"},{"key":"2022071906095548500_ref30","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1177\/026119290503300508","article-title":"QSAR applicabilty domain estimation by projection of the training set descriptor space: a review","volume":"33","author":"Jaworska","year":"2005","journal-title":"Altern Lab Anim"},{"key":"2022071906095548500_ref31","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1177\/026119290503300209","article-title":"Current status of methods for defining the applicability domain of (quantitative) structure-activity relationships. The report and recommendations of ECVAM workshop 52","volume":"33","author":"Netzeva","year":"2005","journal-title":"Altern Lab Anim"},{"key":"2022071906095548500_ref32","volume-title":"Ambit Discovery v0.0.4","author":"Jaworska"},{"key":"2022071906095548500_ref33","doi-asserted-by":"crossref","first-page":"598","DOI":"10.1016\/S1369-5274(98)80095-7","article-title":"Global dinucleotide signatures and analysis of genomic heterogeneity","volume":"1","author":"Karlin","year":"1998","journal-title":"Curr Opin Microbiol"},{"key":"2022071906095548500_ref34","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1093\/dnares\/4.3.185","article-title":"Differences in dinucleotide frequencies of human, yeast, and Escherichia coli genes","volume":"4","author":"Nakashima","year":"1997","journal-title":"DNA Res"},{"key":"2022071906095548500_ref35","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1093\/dnares\/5.5.251","article-title":"Genes from nine genomes are separated into their organisms in the dinucleotide composition space","volume":"5","author":"Nakashima","year":"1998","journal-title":"DNA 
Res"},{"issue":"4","key":"2022071906095548500_ref36","doi-asserted-by":"crossref","first-page":"693","DOI":"10.1101\/gr.634603","article-title":"Informatics for unveiling hidden genome signatures","volume":"13","author":"Abe","year":"2003","journal-title":"Genome Res"},{"issue":"2","key":"2022071906095548500_ref37","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1101\/gr.335003","article-title":"Evolutionary implications of microbial genome tetranucleotide frequency biases","volume":"13","author":"Pride","year":"2003","journal-title":"Genome Res"},{"key":"2022071906095548500_ref38","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1016\/j.ygeno.2009.01.009","article-title":"Estimation of bacterial species phylogeny through oligonucleotide frequency distances","volume":"93","author":"Takahashi","year":"2009","journal-title":"Genomics"},{"key":"2022071906095548500_ref39","doi-asserted-by":"crossref","first-page":"471","DOI":"10.3390\/ph14050471","article-title":"Physicochemical features and peculiarities of interaction of AMP with the membrane","volume":"14","author":"Pirtskhalava","year":"2021","journal-title":"Pharmaceuticals"},{"key":"2022071906095548500_ref40","doi-asserted-by":"crossref","first-page":"1512","DOI":"10.1021\/ci4007003","article-title":"Prediction of linear cationic antimicrobial peptides based on characteristics responsible for their interaction with the membranes","volume":"54","author":"Vishnepolsky","year":"2014","journal-title":"J Chem Inf Model"},{"key":"2022071906095548500_ref41","first-page":"1","article-title":"Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis","volume":"18","author":"Benavoli","year":"2017","journal-title":"Journal of Machine Learning Research"},{"key":"2022071906095548500_ref42","doi-asserted-by":"crossref","DOI":"10.1093\/database\/bay025","article-title":"AntiTbPdb: a knowledgebase of anti-tubercular 
peptides","volume":"2018","author":"Usmani","year":"2018","journal-title":"Database"},{"key":"2022071906095548500_ref43","first-page":"D1","article-title":"UniProt: the universal protein knowledgebase in 2021","volume":"49","author":"The UniProt Consortium","year":"2021","journal-title":"Nucleic Acids Res"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/advance-article-pdf\/doi\/10.1093\/bib\/bbac233\/45017721\/bbac233.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/advance-article-pdf\/doi\/10.1093\/bib\/bbac233\/45017721\/bbac233.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,19]],"date-time":"2022-07-19T02:17:53Z","timestamp":1658197073000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac233\/6611915"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,21]]},"references-count":43,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,7,18]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac233","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.01.28.478081","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,7,18]]},"published":{"date-parts":[[2022,6,21]]},"article-number":"bbac233"}}