{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,28]],"date-time":"2026-05-28T16:26:43Z","timestamp":1779985603412,"version":"3.53.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1013397","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:00:00Z","timestamp":1773792000000}}],"reference-count":55,"publisher":"Public Library of Science (PLoS)","issue":"3","license":[{"start":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T00:00:00Z","timestamp":1773187200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62372392"],"award-info":[{"award-number":["62372392"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Identifying effector proteins of secretion systems in Gram-negative bacteria is crucial for deciphering their pathogenic mechanisms and guiding the development of antimicrobial strategies. Extracting evolutionary and sequence features using pre-trained protein language models (PLMs) has emerged as an effective approach to improve the performance of effector protein prediction. However, the high-dimensional features generated by PLMs contain extensive general biological information, making it difficult to focus on core features when applied directly to effector protein tasks, which in turn limits prediction performance. In this study, we propose MoCETSE, a deep learning model for predicting effector proteins in Gram-negative bacteria. Specifically, MoCETSE first extracts contextual representations of sequences using the pre-trained protein language model ESM-1b. Subsequently, it refines key functional features via a target preprocessing network to construct more expressive sequence representations. Finally, integrated with a transformer module incorporating relative positional encoding, MoCETSE explicitly models the relative spatial relationships between residues, enabling highly accurate prediction of secreted effector proteins. MoCETSE exhibits excellent and robust performance in both five-fold cross-validation and independent testing. Benchmark results demonstrate that it maintains strong competitiveness compared to existing binary and multi-class predictors. Additionally, the model can effectively perform genome-wide effector protein prediction, showing outstanding specificity and reliability. MoCETSE provides an efficient and robust computational framework for the accurate identification of bacterial effector substrates and offers key biological insights.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1013397","type":"journal-article","created":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T17:35:27Z","timestamp":1773250527000},"page":"e1013397","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":2,"title":["MoCETSE: A mixture-of-convolutional experts and transformer-based model for predicting Gram-negative bacterial secreted effectors"],"prefix":"10.1371","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8812-5737","authenticated-orcid":true,"given":"Hua","family":"Shi","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yihang","family":"Lin","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Dachen","family":"Liu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Quan","family":"Zou","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"340","published-online":{"date-parts":[[2026,3,11]]},"reference":[{"issue":"6","key":"pcbi.1013397.ref001","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1038\/nrmicro3456","article-title":"Secretion systems in Gram-negative bacteria: structural and mechanistic insights","volume":"13","author":"TRD Costa","year":"2015","journal-title":"Nat Rev Microbiol"},{"issue":"1","key":"pcbi.1013397.ref002","doi-asserted-by":"crossref","DOI":"10.1128\/microbiolspec.VMBF-0012-2015","article-title":"Bacterial Secretion Systems: An Overview","volume":"4","author":"ER Green","year":"2016","journal-title":"Microbiol Spectr"},{"key":"pcbi.1013397.ref003","doi-asserted-by":"crossref","first-page":"1806","DOI":"10.1016\/j.csbj.2021.03.019","article-title":"Computational prediction of secreted proteins in gram-negative bacteria","volume":"19","author":"X Hui","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"issue":"6","key":"pcbi.1013397.ref004","doi-asserted-by":"crossref","first-page":"524","DOI":"10.1016\/j.tim.2021.10.007","article-title":"The type III secretion system effector network hypothesis","volume":"30","author":"J Sanchez-Garrido","year":"2022","journal-title":"Trends Microbiol"},{"issue":"3","key":"pcbi.1013397.ref005","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1007\/s10529-023-03354-2","article-title":"Bacterial type VI secretion system (T6SS): an evolved molecular weapon with diverse functionality","volume":"45","author":"RP Singh","year":"2023","journal-title":"Biotechnol Lett"},{"issue":"6534","key":"pcbi.1013397.ref006","doi-asserted-by":"crossref","DOI":"10.1126\/science.abc9531","article-title":"Type III secretion system effectors form robust and flexible intracellular virulence networks","volume":"371","author":"D Ruano-Gallego","year":"2021","journal-title":"Science"},{"issue":"5","key":"pcbi.1013397.ref007","doi-asserted-by":"crossref","DOI":"10.1128\/mBio.02180-21","article-title":"The Polar Legionella Icm\/Dot T4SS Establishes Distinct Contact Sites with the Pathogen Vacuole Membrane","volume":"12","author":"D B\u00f6ck","year":"2021","journal-title":"mBio"},{"issue":"1","key":"pcbi.1013397.ref008","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1042\/BCJ20230240","article-title":"Specialized killing across the domains of life by the type VI secretion systems of Pseudomonas aeruginosa","volume":"482","author":"J Colautti","year":"2025","journal-title":"Biochem J"},{"key":"pcbi.1013397.ref009","doi-asserted-by":"crossref","first-page":"813094","DOI":"10.3389\/fmicb.2021.813094","article-title":"T1SEstacker: A Tri-Layer Stacking Model Effectively Predicts Bacterial Type 1 Secreted Proteins Based on C-Terminal Non-repeats-in-Toxin-Motif Sequence Features","volume":"12","author":"Z Chen","year":"2022","journal-title":"Front Microbiol"},{"issue":"12","key":"pcbi.1013397.ref010","doi-asserted-by":"crossref","first-page":"2017","DOI":"10.1093\/bioinformatics\/bty914","article-title":"Bastion3: a two-layer ensemble predictor of type III secreted effectors","volume":"35","author":"J Wang","year":"2019","journal-title":"Bioinformatics"},{"issue":"4","key":"pcbi.1013397.ref011","article-title":"T3SEpp: an Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors","volume":"5","author":"X Hui","year":"2020","journal-title":"mSystems"},{"issue":"2","key":"pcbi.1013397.ref012","doi-asserted-by":"crossref","first-page":"1918","DOI":"10.1093\/bib\/bbaa008","article-title":"EP3: an ensemble predictor that accurately identifies type III secreted effectors","volume":"22","author":"J Li","year":"2021","journal-title":"Brief Bioinform"},{"issue":"3","key":"pcbi.1013397.ref013","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1093\/bib\/bbx164","article-title":"Systematic analysis and prediction of type IV secreted effector proteins by machine learning approaches","volume":"20","author":"J Wang","year":"2019","journal-title":"Brief Bioinform"},{"issue":"5","key":"pcbi.1013397.ref014","doi-asserted-by":"crossref","first-page":"1825","DOI":"10.1093\/bib\/bbz120","article-title":"Convolutional neural network-based annotation of bacterial type IV secretion system effectors with enhanced accuracy and reduced false discovery","volume":"21","author":"J Hong","year":"2020","journal-title":"Brief Bioinform"},{"issue":"1","key":"pcbi.1013397.ref015","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab420","article-title":"T4SEfinder: a bioinformatics tool for genome-scale prediction of bacterial type IV secreted effectors using pre-trained protein language model","volume":"23","author":"Y Zhang","year":"2022","journal-title":"Brief Bioinform"},{"key":"pcbi.1013397.ref016","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1016\/j.csbj.2024.01.015","article-title":"T4SEpp: A pipeline integrating protein language models to predict bacterial type IV secreted effectors","volume":"23","author":"Y Hu","year":"2024","journal-title":"Comput Struct Biotechnol J"},{"issue":"1","key":"pcbi.1013397.ref017","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1186\/s12915-024-02064-z","article-title":"T4Seeker: a hybrid model for type IV secretion effectors identification","volume":"22","author":"J Li","year":"2024","journal-title":"BMC Biol"},{"issue":"15","key":"pcbi.1013397.ref018","doi-asserted-by":"crossref","first-page":"2546","DOI":"10.1093\/bioinformatics\/bty155","article-title":"Bastion6: a bioinformatics approach for accurate prediction of type VI secreted effectors","volume":"34","author":"J Wang","year":"2018","journal-title":"Bioinformatics"},{"issue":"3","key":"pcbi.1013397.ref019","doi-asserted-by":"crossref","first-page":"1950019","DOI":"10.1142\/S0219720019500197","article-title":"PyPredT6: A python-based prediction tool for identification of Type VI effector proteins","volume":"17","author":"R Sen","year":"2019","journal-title":"J Bioinform Comput Biol"},{"issue":"1","key":"pcbi.1013397.ref020","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1093\/bib\/bbx078","article-title":"An account of in silico identification tools of secreted effector proteins in bacteria and future challenges","volume":"20","author":"C Zeng","year":"2019","journal-title":"Brief Bioinform"},{"issue":"6637","key":"pcbi.1013397.ref021","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Z Lin","year":"2023","journal-title":"Science"},{"issue":"15","key":"pcbi.1013397.ref022","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"A Rives","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1013397.ref023","first-page":"123","article-title":"Transformer protein language models are unsupervised structure learners.","volume-title":"Proceedings of the 9th International Conference on Learning Representations; 2021 May 3\u20137; Vienna, Austria","author":"R Rao","year":"2021"},{"key":"pcbi.1013397.ref024","first-page":"5998","volume-title":"Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017); 2017 Dec 4\u20139; Long Beach, CA","author":"A Vaswani","year":"2017"},{"issue":"8","key":"pcbi.1013397.ref025","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1093\/bioinformatics\/btac020","article-title":"ProteinBERT: a universal deep-learning model of protein sequence and function","volume":"38","author":"N Brandes","year":"2022","journal-title":"Bioinformatics"},{"key":"pcbi.1013397.ref026","doi-asserted-by":"crossref","first-page":"1506508","DOI":"10.3389\/fbioe.2025.1506508","article-title":"Evaluating the advancements in protein language models for encoding strategies in protein function prediction: a comprehensive review","volume":"13","author":"J-Y Chen","year":"2025","journal-title":"Front Bioeng Biotechnol"},{"key":"pcbi.1013397.ref027","first-page":"0258","article-title":"DeepSecE: A Deep-Learning-Based Framework for Multiclass Prediction of Secreted Proteins in Gram-Negative Bacteria","volume":"6","author":"Y Zhang","year":"2023","journal-title":"Research (Wash D C)"},{"issue":"5","key":"pcbi.1013397.ref028","article-title":"Unraveling and characterization of novel T3SS effectors in Edwardsiella piscicida","volume":"8","author":"XJ Liao","year":"2023","journal-title":"mSphere"},{"issue":"23","key":"pcbi.1013397.ref029","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"L Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"pcbi.1013397.ref030","unstructured":"Hendrycks D, Gimpel K. Gaussian error linear units (GELUs) [Preprint]. arXiv:1606.08415 [cs.LG]. 2016 Jun 27 [revised 2023 Jun 6; version 5]. Available from: https:\/\/arxiv.org\/abs\/1606.08415"},{"issue":"8","key":"pcbi.1013397.ref031","doi-asserted-by":"crossref","first-page":"1099","DOI":"10.1038\/s41587-022-01618-2","article-title":"Large language models generate functional protein sequences across diverse families","volume":"41","author":"A Madani","year":"2023","journal-title":"Nat Biotechnol"},{"issue":"7","key":"pcbi.1013397.ref032","doi-asserted-by":"crossref","first-page":"1023","DOI":"10.1038\/s41587-021-01156-3","article-title":"SignalP 6.0 predicts all five types of signal peptides using protein language models","volume":"40","author":"F Teufel","year":"2022","journal-title":"Nat Biotechnol"},{"key":"pcbi.1013397.ref033","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkaf394","article-title":"The UniProt website API: facilitating programmatic access to protein knowledge","volume":"53","author":"S Ahmad","year":"2025","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"pcbi.1013397.ref034","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1162\/neco.1991.3.1.79","article-title":"Adaptive mixtures of local experts","volume":"3","author":"RA Jacobs","year":"1991","journal-title":"Neural Comput"},{"key":"pcbi.1013397.ref035","unstructured":"Dai D, Deng C, Zhao C, Xu RX, Gao H, Chen D, et al. DeepSeekMoE: towards ultimate expert specialization in mixture-of-experts language models [Preprint]. arXiv:2401.05639 [cs.CL]. 2024 Jan 11. Available from: https:\/\/arxiv.org\/abs\/2401.05639"},{"key":"pcbi.1013397.ref036","unstructured":"Huang Q, An Z, Zhuang N, Tao M, Zhang C, Jin Y, et al. Harder tasks need more experts: dynamic routing in MoE models [Preprint]. arXiv:2403.07679 [cs.LG]. 2024 Mar 12. Available from: https:\/\/arxiv.org\/abs\/2403.07679"},{"key":"pcbi.1013397.ref037","unstructured":"Liu H, Xia M, Gao T, Wang R, Chen D. Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers. arXiv [Preprint]. 2022 May 27. Available from: https:\/\/arxiv.org\/abs\/2205.14336"},{"key":"pcbi.1013397.ref038","doi-asserted-by":"crossref","unstructured":"Zhang Y, Cai R, Chen T, Zhang G, Zhang H, Chen P, et al. Robust mixture-of-expert training for convolutional neural networks [Preprint]. arXiv:2308.09751 [cs.CV]. 2023. Available from: https:\/\/arxiv.org\/abs\/2308.09751","DOI":"10.1109\/ICCV51070.2023.00015"},{"key":"pcbi.1013397.ref039","doi-asserted-by":"crossref","unstructured":"Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv. 2014. doi: 10.48550\/arXiv.1406.1078","DOI":"10.3115\/v1\/D14-1179"},{"key":"pcbi.1013397.ref040","unstructured":"Gehring J, Auli M, Grangier D, Yarats D, Dauphin YN. Convolutional sequence to sequence learning [Preprint]. arXiv:1705.03122 [cs.CL]. 2017 May 8 [revised 2017 Jul 25; version 3]. Available from: https:\/\/arxiv.org\/abs\/1705.03122"},{"issue":"1","key":"pcbi.1013397.ref041","first-page":"89","article-title":"Design of a modified transformer architecture based on relative position coding","volume":"14","author":"Y Zhou","year":"2021","journal-title":"Int J Comput Intell Syst"},{"key":"pcbi.1013397.ref042","first-page":"4644","article-title":"Self-attention with relative position representations.","volume-title":"Proceedings of the 2018 Conference on Neural Information Processing Systems (NeurIPS 2018); 2018 Dec 2\u20138; Montr\u00e9al, Canada. Curran Associates","author":"P Shaw","year":"2018"},{"key":"pcbi.1013397.ref043","article-title":"BastionHub: a universal platform for integrating and analyzing substrates secreted by Gram-negative bacteria","volume":"49","author":"J Wang","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"pcbi.1013397.ref044","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1038\/s43586-024-00363-x","article-title":"Uniform manifold approximation and projection","volume":"4","author":"J Healy","year":"2024","journal-title":"Nature Reviews Methods Primers"},{"key":"pcbi.1013397.ref045","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-1-0716-3445-5_1","article-title":"Identification of Protein Secretion Systems in Bacterial Genomes Using MacSyFinder Version 2","volume":"2715","author":"SS Abby","year":"2024","journal-title":"Methods Mol Biol"},{"issue":"7","key":"pcbi.1013397.ref046","doi-asserted-by":"crossref","first-page":"2272","DOI":"10.1093\/bioinformatics\/btz921","article-title":"Logomaker: beautiful sequence logos in Python","volume":"36","author":"A Tareen","year":"2020","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1013397.ref047","first-page":"148","article-title":"Comprehensive assessment and performance improvement of effector protein predictors for bacterial secretion systems III, IV and VI","volume":"19","author":"Y An","year":"2018","journal-title":"Brief Bioinform"},{"key":"pcbi.1013397.ref048","first-page":"9689","article-title":"Evaluating Protein Transfer Learning with TAPE","volume":"32","author":"R Rao","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"issue":"11","key":"pcbi.1013397.ref049","doi-asserted-by":"crossref","first-page":"1162","DOI":"10.1016\/j.tim.2023.05.011","article-title":"Features and algorithms: facilitating investigation of secreted effectors in Gram-negative bacteria","volume":"31","author":"Z Zhao","year":"2023","journal-title":"Trends Microbiol"},{"issue":"2","key":"pcbi.1013397.ref050","doi-asserted-by":"crossref","DOI":"10.1128\/microbiolspec.PSIB-0003-2018","article-title":"Type I Secretion Systems-One Mechanism for All?","volume":"7","author":"O Spitz","year":"2019","journal-title":"Microbiol Spectr"},{"issue":"1","key":"pcbi.1013397.ref051","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1038\/nrmicro.2016.161","article-title":"Protein export through the bacterial Sec pathway","volume":"15","author":"A Tsirigotaki","year":"2017","journal-title":"Nat Rev Microbiol"},{"issue":"1","key":"pcbi.1013397.ref052","doi-asserted-by":"crossref","first-page":"75","DOI":"10.3390\/microorganisms13010075","article-title":"The Bacterial Type III Secretion System as a Broadly Applied Protein Delivery Tool in Biological Sciences","volume":"13","author":"L Jia","year":"2025","journal-title":"Microorganisms"},{"issue":"1","key":"pcbi.1013397.ref053","doi-asserted-by":"crossref","first-page":"2623","DOI":"10.1038\/s41467-020-16397-0","article-title":"Structural basis for effector protein recognition by the Dot\/Icm Type IVB coupling protein complex","volume":"11","author":"H Kim","year":"2020","journal-title":"Nat Commun"},{"key":"pcbi.1013397.ref054","doi-asserted-by":"crossref","first-page":"584751","DOI":"10.3389\/fcimb.2020.584751","article-title":"An Overview of Anti-Eukaryotic T6SS Effectors","volume":"10","author":"J Monjar\u00e1s Feria","year":"2020","journal-title":"Front Cell Infect Microbiol"},{"key":"pcbi.1013397.ref055","article-title":"SaProt: Protein language modeling with structure-aware vocabulary.","volume-title":"Proceedings of the International Conference on Learning Representations (ICLR 2024); 2024 May","author":"J Su","year":"2024"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1013397","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:00:00Z","timestamp":1773792000000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013397","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T17:46:46Z","timestamp":1773856006000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013397"}},"subtitle":[],"editor":[{"given":"Shugang","family":"Zhang","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2026,3,11]]},"references-count":55,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2026,3,11]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1013397","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,11]]}}}