{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T19:14:59Z","timestamp":1773429299978,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2024,5,2]],"date-time":"2024-05-02T00:00:00Z","timestamp":1714608000000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"JSPS KAKENHI","award":["JP23H03411"],"award-info":[{"award-number":["JP23H03411"]}]},{"name":"JSPS KAKENHI","award":["JP22K12144"],"award-info":[{"award-number":["JP22K12144"]}]},{"name":"JSPS KAKENHI","award":["JPMJPF2017"],"award-info":[{"award-number":["JPMJPF2017"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62301368"],"award-info":[{"award-number":["62301368"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Municipal Government of Quzhou","award":["2023D036"],"award-info":[{"award-number":["2023D036"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,5,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Peptides are promising agents for the treatment of a variety of diseases due to their specificity and efficacy. However, the development of peptide-based drugs is often hindered by the potential toxicity of peptides, which poses a significant barrier to their clinical application. Traditional experimental methods for evaluating peptide toxicity are time-consuming and costly, making the development process inefficient. 
Therefore, there is an urgent need for computational tools specifically designed to predict peptide toxicity accurately and rapidly, facilitating the identification of safe peptide candidates for drug development.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We provide here a novel computational approach, CAPTP, which leverages the power of convolution and self-attention to enhance the prediction of peptide toxicity from amino acid sequences. CAPTP demonstrates outstanding performance, achieving a Matthews correlation coefficient of approximately 0.82 in both cross-validation settings and on independent test datasets. This performance surpasses that of existing state-of-the-art peptide toxicity predictors. Importantly, CAPTP maintains its robustness and generalizability even when dealing with data imbalances. Further analysis by CAPTP reveals that certain sequential patterns, particularly in the head and central regions of peptides, are crucial in determining their toxicity. 
This insight can significantly inform and guide the design of safer peptide drugs.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The source code for CAPTP is freely available at https:\/\/github.com\/jiaoshihu\/CAPTP.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae297","type":"journal-article","created":{"date-parts":[[2024,5,2]],"date-time":"2024-05-02T20:25:03Z","timestamp":1714681503000},"source":"Crossref","is-referenced-by-count":27,"title":["Integrated convolution and self-attention for improving peptide toxicity prediction"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7428-4876","authenticated-orcid":false,"given":"Shihu","family":"Jiao","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Tsukuba , Tsukuba 3058577, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5547-3919","authenticated-orcid":false,"given":"Xiucai","family":"Ye","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Tsukuba , Tsukuba 3058577, Japan"}]},{"given":"Tetsuya","family":"Sakurai","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Tsukuba , Tsukuba 3058577, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6406-1142","authenticated-orcid":false,"given":"Quan","family":"Zou","sequence":"additional","affiliation":[{"name":"Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China , Chengdu 610054, China"},{"name":"Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China , Quzhou 324000, China"}]},{"given":"Ruijun","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Software, Beihang University , Beijing 100191, 
China"}]}],"member":"286","published-online":{"date-parts":[[2024,5,2]]},"reference":[{"key":"2024052508101724900_btae297-B1","doi-asserted-by":"crossref","first-page":"1527","DOI":"10.4155\/fmc.12.94","article-title":"Therapeutic peptides","volume":"4","author":"Albericio","year":"2012","journal-title":"Future Med Chem"},{"key":"2024052508101724900_btae297-B2","doi-asserted-by":"crossref","first-page":"430","DOI":"10.3390\/molecules26020430","article-title":"A global review on short peptides: frontiers and perspectives","volume":"26","author":"Apostolopoulos","year":"2021","journal-title":"Molecules"},{"key":"2024052508101724900_btae297-B3","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1021\/acs.chemrestox.5b00407","article-title":"Toxicology strategies for drug discovery: present and future","volume":"29","author":"Blomme","year":"2016","journal-title":"Chem Res Toxicol"},{"key":"2024052508101724900_btae297-B4","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1186\/s13321-023-00702-2","article-title":"Deep generative model for drug design from protein target sequence","volume":"15","author":"Chen","year":"2023","journal-title":"J Cheminform"},{"key":"2024052508101724900_btae297-B5","doi-asserted-by":"crossref","first-page":"e60","DOI":"10.1093\/nar\/gkab122","article-title":"iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization","volume":"49","author":"Chen","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2024052508101724900_btae297-B6","doi-asserted-by":"crossref","first-page":"6481","DOI":"10.1021\/acs.analchem.1c00354","article-title":"PepFormer: end-to-end transformer-based Siamese network to predict and enhance peptide detectability based on sequence only","volume":"93","author":"Cheng","year":"2021","journal-title":"Anal 
Chem"},{"key":"2024052508101724900_btae297-B7","doi-asserted-by":"crossref","first-page":"678","DOI":"10.3892\/ijo.2020.5099","article-title":"Anticancer peptide: physicochemical property, functional aspect and trend in clinical application","volume":"57","author":"Chiangjong","year":"2020","journal-title":"Int J Oncol"},{"key":"2024052508101724900_btae297-B8","doi-asserted-by":"crossref","first-page":"D506","DOI":"10.1093\/nar\/gky1049","article-title":"UniProt: a worldwide hub of protein knowledge","volume":"47","author":"Consortium","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2024052508101724900_btae297-B9","first-page":"2286","volume-title":"International Conference on Machine Learning","author":"","year":"2021"},{"key":"2024052508101724900_btae297-B10","first-page":"3965","article-title":"Coatnet: marrying convolution and attention for all data sizes","volume":"34","author":"","year":"2021","journal-title":"Adv Neural Inform Process Syst"},{"key":"2024052508101724900_btae297-B11","first-page":"4171","author":""},{"key":"2024052508101724900_btae297-B12","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2024052508101724900_btae297-B13","doi-asserted-by":"crossref","first-page":"10427","DOI":"10.1021\/acs.jpclett.3c02398","article-title":"Peptidebert: a language model based on transformers for peptide property prediction","volume":"14","author":"Guntuboina","year":"2023","journal-title":"J Phys Chem Lett"},{"key":"2024052508101724900_btae297-B14","author":"","year":"2022"},{"key":"2024052508101724900_btae297-B15","doi-asserted-by":"crossref","first-page":"3198","DOI":"10.1016\/j.csbj.2021.05.039","article-title":"Representation learning applications in biological sequence 
analysis","volume":"19","author":"Iuchi","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"key":"2024052508101724900_btae297-B16","doi-asserted-by":"crossref","first-page":"17923","DOI":"10.1038\/s41598-019-54405-6","article-title":"NNTox: gene ontology-based protein toxicity prediction using neural network","volume":"9","author":"Jain","year":"2019","journal-title":"Sci Rep"},{"key":"2024052508101724900_btae297-B17","doi-asserted-by":"crossref","first-page":"2206151","DOI":"10.1002\/advs.202206151","article-title":"Explainable deep hypergraph learning modeling the peptide secondary structure prediction","volume":"10","author":"Jiang","year":"2023","journal-title":"Adv Sci"},{"key":"2024052508101724900_btae297-B18","doi-asserted-by":"crossref","first-page":"236","DOI":"10.2174\/1570163815666180219112806","article-title":"Toxicity of biologically active peptides and future safety aspects: an update","volume":"15","author":"Khan","year":"2018","journal-title":"Curr Drug Discov Technol"},{"key":"2024052508101724900_btae297-B19","doi-asserted-by":"crossref","first-page":"e2000833","DOI":"10.1002\/cbdv.202000833","article-title":"Peptides as active ingredients: a challenge for cosmeceutical industry","volume":"18","author":"Ledwo\u0144","year":"2021","journal-title":"Chem Biodivers"},{"key":"2024052508101724900_btae297-B20","first-page":"3919","article-title":"The antimicrobial peptides and their potential clinical applications","volume":"11","author":"","year":"2019","journal-title":"Am J Translat Res"},{"key":"2024052508101724900_btae297-B21","doi-asserted-by":"crossref","first-page":"e1011214","DOI":"10.1371\/journal.pcbi.1011214","article-title":"BioSeq-Diabolo: biological sequence similarity analysis using Diabolo","volume":"19","author":"Li","year":"2023","journal-title":"PLoS Comput Biol"},{"key":"2024052508101724900_btae297-B22","doi-asserted-by":"crossref","first-page":"e129","DOI":"10.1093\/nar\/gkab829","article-title":"BioSeq-BLM: a platform for 
analyzing DNA, RNA, and protein sequences based on biological language models","volume":"49","author":"Li","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2024052508101724900_btae297-B23","doi-asserted-by":"crossref","first-page":"121574","DOI":"10.1016\/j.eswa.2023.121574","article-title":"TranSiam: aggregating multi-modal visual features with locality for medical image segmentation","volume":"237","author":"Li","year":"2024","journal-title":"Expert Syst Appl"},{"key":"2024052508101724900_btae297-B24","doi-asserted-by":"crossref","first-page":"bbad320","DOI":"10.1093\/bib\/bbad320","article-title":"Sequence alignment\/map format: a comprehensive review of approaches and applications","volume":"24","author":"Liu","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024052508101724900_btae297-B25","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1109\/MCI.2023.3245731","article-title":"Evolutionary multi-objective optimization in searching for various antimicrobial peptides [feature]","volume":"18","author":"Liu","year":"2023","journal-title":"IEEE Comput Intell Mag"},{"key":"2024052508101724900_btae297-B26","doi-asserted-by":"crossref","first-page":"431","DOI":"10.3390\/pharmaceutics15020431","article-title":"CSM-Toxin: a web-server for predicting protein toxicity","volume":"15","author":"Morozov","year":"2023","journal-title":"Pharmaceutics"},{"key":"2024052508101724900_btae297-B27","doi-asserted-by":"crossref","first-page":"2397","DOI":"10.1093\/bioinformatics\/btac135","article-title":"fastISM: performant in silico saturation mutagenesis for convolutional neural networks","volume":"38","author":"Nair","year":"2022","journal-title":"Bioinformatics"},{"key":"2024052508101724900_btae297-B28","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1186\/s12915-022-01426-9","article-title":"Accurate prediction of functional states of cis-regulatory modules reveals common epigenetic rules in humans and 
mice","volume":"20","author":"Ni","year":"2022","journal-title":"BMC Biol"},{"key":"2024052508101724900_btae297-B29","doi-asserted-by":"crossref","first-page":"1750","DOI":"10.1016\/j.csbj.2021.03.022","article-title":"The language of proteins: NLP, machine learning & protein sequences","volume":"19","author":"Ofer","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"key":"2024052508101724900_btae297-B30","doi-asserted-by":"crossref","first-page":"323","DOI":"10.3390\/ph15030323","article-title":"Traditional and computational screening of non-toxic peptides and approaches to improving selectivity","volume":"15","author":"Robles-Loaiza","year":"2022","journal-title":"Pharmaceuticals"},{"key":"2024052508101724900_btae297-B31","doi-asserted-by":"crossref","first-page":"3576","DOI":"10.1021\/acs.accounts.1c00239","article-title":"Biomedical applications of a novel class of high-affinity peptides","volume":"54","author":"Saw","year":"2021","journal-title":"Acc Chem Res"},{"key":"2024052508101724900_btae297-B32","doi-asserted-by":"crossref","first-page":"bbac174","DOI":"10.1093\/bib\/bbac174","article-title":"ToxinPred2: an improved method for predicting toxicity of proteins","volume":"23","author":"Sharma","year":"2022","journal-title":"Brief Bioinform"},{"key":"2024052508101724900_btae297-B33","doi-asserted-by":"crossref","first-page":"106322","DOI":"10.1016\/j.compbiomed.2022.106322","article-title":"ToxMVA: an end-to-end multi-view deep autoencoder method for protein toxicity prediction","volume":"151","author":"Shi","year":"2022","journal-title":"Comput Biol Med"},{"key":"2024052508101724900_btae297-B34","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1186\/s13321-023-00767-z","article-title":"Pmf-cpi: assessing drug selectivity with a pretrained multi-functional model for compound-protein interactions","volume":"15","author":"Song","year":"2023","journal-title":"J 
Cheminform"},{"key":"2024052508101724900_btae297-B35","doi-asserted-by":"crossref","first-page":"121294","DOI":"10.1016\/j.eswa.2023.121294","article-title":"Supervised contrastive representation learning with tree-structured parzen estimator Bayesian optimization for imbalanced tabular data","volume":"237","author":"Tao","year":"2024","journal-title":"Expert Syst Appl"},{"key":"2024052508101724900_btae297-B36","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2024052508101724900_btae297-B37","doi-asserted-by":"crossref","first-page":"3525","DOI":"10.1007\/s00018-019-03138-w","article-title":"Antiviral peptides as promising therapeutic drugs","volume":"76","author":"Vilas Boas","year":"2019","journal-title":"Cell Mol Life Sci"},{"key":"2024052508101724900_btae297-B38","doi-asserted-by":"crossref","first-page":"e46","DOI":"10.1093\/nar\/gkab016","article-title":"DM3Loc: multi-label mRNA subcellular localization prediction and analysis based on multi-head self-attention mechanism","volume":"49","author":"Wang","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2024052508101724900_btae297-B39","author":"","year":"2023"},{"key":"2024052508101724900_btae297-B40","doi-asserted-by":"crossref","first-page":"3017","DOI":"10.1093\/nar\/gkad055","article-title":"DeepBIO: an automated and interpretable deep-learning platform for high-throughput biological sequence prediction, functional annotation and visualization analysis","volume":"51","author":"Wang","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024052508101724900_btae297-B41","doi-asserted-by":"crossref","first-page":"1514","DOI":"10.1093\/bioinformatics\/btac006","article-title":"ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer 
learning","volume":"38","author":"Wei","year":"2022","journal-title":"Bioinformatics"},{"key":"2024052508101724900_btae297-B42","doi-asserted-by":"crossref","first-page":"bbab041","DOI":"10.1093\/bib\/bbab041","article-title":"ATSE: a peptide toxicity predictor by exploiting structural and evolutionary information based on graph neural network and attention mechanism","volume":"22","author":"Wei","year":"2021","journal-title":"Brief Bioinform"},{"key":"2024052508101724900_btae297-B43","doi-asserted-by":"crossref","first-page":"btac715","DOI":"10.1093\/bioinformatics\/btac715","article-title":"sAMPpred-GAT: prediction of antimicrobial peptide by graph attention network and predicted peptide structure","volume":"39","author":"Yan","year":"2023","journal-title":"Bioinformatics"},{"key":"2024052508101724900_btae297-B44","doi-asserted-by":"crossref","first-page":"109437","DOI":"10.1016\/j.knosys.2022.109437","article-title":"A class-aware supervised contrastive learning framework for imbalanced fault diagnosis","volume":"252","author":"Zhang","year":"2022","journal-title":"Knowl Based Syst"},{"key":"2024052508101724900_btae297-B45","doi-asserted-by":"crossref","first-page":"3490","DOI":"10.1039\/C7CS00793K","article-title":"Peptide-based nanoprobes for molecular imaging and disease diagnostics","volume":"47","author":"Zhang","year":"2018","journal-title":"Chem Soc Rev"},{"key":"2024052508101724900_btae297-B46","doi-asserted-by":"crossref","first-page":"2465","DOI":"10.3390\/diagnostics13142465","article-title":"A first computational frame for recognizing heparin-binding protein","volume":"13","author":"Zhu","year":"2023","journal-title":"Diagnostics (Basel)"},{"key":"2024052508101724900_btae297-B47","doi-asserted-by":"crossref","first-page":"1281880","DOI":"10.3389\/fmed.2023.1281880","article-title":"Accurately identifying hemagglutinin using sequence information and machine learning methods","volume":"10","author":"Zou","year":"2023","journal-title":"Front Med 
(Lausanne)"},{"key":"2024052508101724900_btae297-B48","doi-asserted-by":"crossref","first-page":"1291352","DOI":"10.3389\/fmed.2023.1291352","article-title":"Deep-STP: a deep learning-based approach to predict snake toxin proteins by using word embeddings","volume":"10","author":"Zulfiqar","year":"2023","journal-title":"Front Med (Lausanne)"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae297\/57390675\/btae297.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/5\/btae297\/57904511\/btae297.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/5\/btae297\/57904511\/btae297.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,25]],"date-time":"2024-05-25T08:39:39Z","timestamp":1716626379000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae297\/7663469"}},"subtitle":[],"editor":[{"given":"Xin","family":"Gao","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,5,1]]},"references-count":48,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,5,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae297","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5,1]]},"published":{"date-parts":[[2024,5,1]]},"article-number":"btae297"}}