{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,16]],"date-time":"2026-06-16T15:43:43Z","timestamp":1781624623991,"version":"3.54.5"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2024,9,10]],"date-time":"2024-09-10T00:00:00Z","timestamp":1725926400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"JSPS KAKENHI","award":["JP23H00509"],"award-info":[{"award-number":["JP23H00509"]}]},{"name":"JSPS KAKENHI","award":["JP22H04925"],"award-info":[{"award-number":["JP22H04925"]}]},{"name":"JSPS KAKENHI","award":["JP20H00624"],"award-info":[{"award-number":["JP20H00624"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Antibiotic resistance has emerged as a major global health threat, with an increasing number of bacterial infections becoming difficult to treat. Predicting the underlying resistance mechanisms of antibiotic resistance genes (ARGs) is crucial for understanding and combating this problem. However, existing methods struggle to accurately predict resistance mechanisms for ARGs with low similarity to known sequences and lack sufficient interpretability of the prediction models.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this study, we present a novel approach for predicting ARG resistance mechanisms using ProteinBERT, a protein language model (pLM) based on deep learning. Our method outperforms state-of-the-art techniques on diverse ARG datasets, including those with low homology to the training data, highlighting its potential for predicting the resistance mechanisms of unknown ARGs. Attention analysis of the model reveals that it considers biologically relevant features, such as conserved amino acid residues and antibiotic target binding sites, when making predictions. These findings provide valuable insights into the molecular basis of antibiotic resistance and demonstrate the interpretability of pLMs, offering a new perspective on their application in bioinformatics.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The source code is available for free at https:\/\/github.com\/hmdlab\/ARG-BERT. The output results of the model are published at https:\/\/waseda.box.com\/v\/ARG-BERT-suppl.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae550","type":"journal-article","created":{"date-parts":[[2024,9,7]],"date-time":"2024-09-07T01:59:58Z","timestamp":1725674398000},"source":"Crossref","is-referenced-by-count":15,"title":["Prediction of antibiotic resistance mechanisms using a protein language model"],"prefix":"10.1093","volume":"40","author":[{"given":"Kanami","family":"Yagimoto","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering and Bioscience, Graduate School of Advanced Science and Engineering, Waseda University , Tokyo 169-8555,","place":["Japan"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Shion","family":"Hosoda","sequence":"additional","affiliation":[{"name":"Center for Exploratory Research, Research and Development Group, Hitachi, Ltd , Tokyo 185-8601,","place":["Japan"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Miwa","family":"Sato","sequence":"additional","affiliation":[{"name":"Center for Exploratory Research, Research and Development Group, Hitachi, Ltd , Tokyo 185-8601,","place":["Japan"]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9466-1034","authenticated-orcid":false,"given":"Michiaki","family":"Hamada","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Bioscience, Graduate School of Advanced Science and Engineering, Waseda University , Tokyo 169-8555,","place":["Japan"]},{"name":"National Institute of Advanced Industrial Science and Technology Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), , Tokyo 169-8555,","place":["Japan"]},{"name":"Graduate School of Medicine, Nippon Medical School , Tokyo 113-8602,","place":["Japan"]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2024,9,10]]},"reference":[{"key":"2024101004163401400_btae550-B1","first-page":"3782","author":"Ahmed","year":"2022"},{"key":"2024101004163401400_btae550-B2","doi-asserted-by":"publisher","author":"Ahmed","year":"2024","DOI":"10.1101\/2024.03.20.585944"},{"key":"2024101004163401400_btae550-B3","first-page":"D517","article-title":"CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database","volume":"48","author":"Alcock","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2024101004163401400_btae550-B4","doi-asserted-by":"crossref","first-page":"D690","DOI":"10.1093\/nar\/gkac920","article-title":"CARD 2023: expanded curation, support for machine learning, and resistome prediction at the Comprehensive Antibiotic Resistance Database","volume":"51","author":"Alcock","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024101004163401400_btae550-B5","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.jbiotec.2014.11.024","article-title":"Rifampicin-resistance, rpoB polymorphism and RNA polymerase genetic engineering","volume":"202","author":"Alifano","year":"2015","journal-title":"J Biotechnol"},{"key":"2024101004163401400_btae550-B6","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J Mol Biol"},{"key":"2024101004163401400_btae550-B7","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1186\/s40168-018-0401-z","article-title":"DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data","volume":"6","author":"Arango-Argoty","year":"2018","journal-title":"Microbiome"},{"key":"2024101004163401400_btae550-B8","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene Ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat Genet"},{"key":"2024101004163401400_btae550-B9","doi-asserted-by":"crossref","first-page":"S470","DOI":"10.1093\/clinids\/8.Supplement_5.S470","article-title":"Classification of \u03b2-lactamases","volume":"8","author":"Bauernfeind","year":"1986","journal-title":"Rev Infect Dis"},{"key":"2024101004163401400_btae550-B10","doi-asserted-by":"publisher","author":"Borelli","year":"2024","DOI":"10.1101\/2024.06.11.598242"},{"key":"2024101004163401400_btae550-B11","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1093\/bioinformatics\/btac020","article-title":"ProteinBERT: a universal deep-learning model of protein sequence and function","volume":"38","author":"Brandes","year":"2022","journal-title":"Bioinformatics"},{"key":"2024101004163401400_btae550-B12","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1128\/MMBR.65.2.232-260.2001","article-title":"Tetracycline antibiotics: mode of action, applications, molecular biology, and epidemiology of bacterial resistance","volume":"65","author":"Chopra","year":"2001","journal-title":"Microbiol Mol Biol Rev"},{"key":"2024101004163401400_btae550-B13","doi-asserted-by":"publisher","author":"Devlin","year":"2019","DOI":"10.48550\/arXiv.1810.04805"},{"key":"2024101004163401400_btae550-B14","doi-asserted-by":"publisher","author":"Ding","year":"2024","DOI":"10.1101\/2024.03.07.584001"},{"key":"2024101004163401400_btae550-B15","doi-asserted-by":"crossref","first-page":"7112","DOI":"10.1109\/TPAMI.2021.3095381","article-title":"ProtTrans: toward understanding the language of life through self-supervised learning","volume":"44","author":"Elnaggar","year":"2021","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2024101004163401400_btae550-B16","doi-asserted-by":"crossref","first-page":"12728","DOI":"10.1038\/s41598-021-91456-0","article-title":"AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence","volume":"11","author":"Feldgarden","year":"2021","journal-title":"Sci Rep"},{"key":"2024101004163401400_btae550-B17","first-page":"000748","article-title":"ResFinder \u2013 an open online resource for identification of antimicrobial resistance genes in next-generation sequencing data and prediction of phenotypes from genotypes","volume":"8","author":"Florensa","year":"2022","journal-title":"Microb Genom"},{"key":"2024101004163401400_btae550-B18","first-page":"3150","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2024101004163401400_btae550-B19","doi-asserted-by":"crossref","first-page":"iyad031","DOI":"10.1093\/genetics\/iyad031","article-title":"The Gene Ontology knowledgebase in 2023","volume":"224","author":"Consortium TGO, Aleksander SA, Balhoff J","year":"2023","journal-title":"Genetics"},{"key":"2024101004163401400_btae550-B20","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1128\/AAC.01310-13","article-title":"ARG-ANNOT, a new bioinformatic tool to discover antibiotic resistance genes in bacterial genomes","volume":"58","author":"Gupta","year":"2014","journal-title":"Antimicrob Agents Chemother"},{"key":"2024101004163401400_btae550-B21","doi-asserted-by":"crossref","first-page":"242","DOI":"10.3389\/fpubh.2019.00242","article-title":"Using genomics to track global antimicrobial resistance","volume":"7","author":"Hendriksen","year":"2019","journal-title":"Front Public Health"},{"key":"2024101004163401400_btae550-B22","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.1093\/bioinformatics\/btu031","article-title":"InterProScan 5: genome-scale protein function classification","volume":"30","author":"Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"2024101004163401400_btae550-B23","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2024101004163401400_btae550-B24","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1038\/s12276-021-00569-z","article-title":"Antibiotic resistome from the one-health perspective: understanding and controlling antimicrobial resistance transmission","volume":"53","author":"Kim","year":"2021","journal-title":"Exp Mol Med"},{"key":"2024101004163401400_btae550-B25","doi-asserted-by":"crossref","first-page":"e4529","DOI":"10.1002\/pro.4529","article-title":"AMP-BERT: prediction of antimicrobial peptide function based on a BERT model","volume":"32","author":"Lee","year":"2023","journal-title":"Protein Sci"},{"key":"2024101004163401400_btae550-B26","doi-asserted-by":"crossref","first-page":"100513","DOI":"10.1016\/j.patter.2022.100513","article-title":"Deciphering the language of antibodies using self-supervised learning","volume":"3","author":"Leem","year":"2022","journal-title":"Patterns (N Y)"},{"key":"2024101004163401400_btae550-B27","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1186\/s40168-021-01002-3","article-title":"HMD-ARG: hierarchical multi-task deep learning for annotating antibiotic resistance genes","volume":"9","author":"Li","year":"2021","journal-title":"Microbiome"},{"key":"2024101004163401400_btae550-B28","author":"O\u2019Neill","year":"2016"},{"key":"2024101004163401400_btae550-B29","doi-asserted-by":"crossref","first-page":"D418","DOI":"10.1093\/nar\/gkac993","article-title":"InterPro in 2022","volume":"51","author":"Paysan-Lafosse","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024101004163401400_btae550-B30","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2024101004163401400_btae550-B31","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference clusters","volume":"23","author":"Suzek","year":"2007","journal-title":"Bioinformatics"},{"key":"2024101004163401400_btae550-B32","doi-asserted-by":"crossref","first-page":"941","DOI":"10.1093\/bioinformatics\/btab801","article-title":"NetSolP: predicting protein solubility in Escherichia coli using language models","volume":"38","author":"Thumuluri","year":"2022","journal-title":"Bioinformatics"},{"key":"2024101004163401400_btae550-B33","doi-asserted-by":"crossref","first-page":"D439","DOI":"10.1093\/nar\/gkab1061","article-title":"AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models","volume":"50","author":"Varadi","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024101004163401400_btae550-B34"},{"key":"2024101004163401400_btae550-B35","author":"World Health Organization","year":"2015"},{"key":"2024101004163401400_btae550-B36","doi-asserted-by":"crossref","first-page":"btad690","DOI":"10.1093\/bioinformatics\/btad690","article-title":"PLM-ARG: antibiotic resistance gene identification using a pretrained protein language model","volume":"39","author":"Wu","year":"2023","journal-title":"Bioinformatics"},{"key":"2024101004163401400_btae550-B37","doi-asserted-by":"crossref","first-page":"vbac023","DOI":"10.1093\/bioadv\/vbac023","article-title":"Prediction of RNA\u2013protein interactions using a nucleotide language model","volume":"2","author":"Yamada","year":"2022","journal-title":"Bioinform Adv"},{"key":"2024101004163401400_btae550-B38","doi-asserted-by":"publisher","DOI":"10.1101\/2024.01.30.577970"},{"key":"2024101004163401400_btae550-B39","author":"Zhou","year":"2023"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae550\/59074482\/btae550.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/10\/btae550\/59694361\/btae550.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/10\/btae550\/59694361\/btae550.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,10]],"date-time":"2024-10-10T07:24:01Z","timestamp":1728545041000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae550\/7754486"}},"subtitle":[],"editor":[{"given":"Xin","family":"Gao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2024,9,10]]},"references-count":39,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2024,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae550","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.05.04.592288","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,10]]},"published":{"date-parts":[[2024,9,10]]},"article-number":"btae550"}}