{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T06:34:56Z","timestamp":1769841296588,"version":"3.49.0"},"reference-count":74,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T00:00:00Z","timestamp":1768089600000},"content-version":"vor","delay-in-days":10,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Member States of the European Molecular Biology Laboratory"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,1,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Recombinant protein expression can be a limiting step in the production of protein reagents for drug discovery and other biotechnology applications. We introduce RP3Net (Recombinant Protein Production Prediction Network), an AI model of small-scale heterologous soluble protein expression in Escherichia coli. RP3Net utilizes the most recent protein and genomic foundational models. A curated dataset of internal experimental results from AstraZeneca and publicly available data from the Structural Genomics Consortium was used for training, validation and testing of RP3Net.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>RP3Net achieves an increase in area under the receiver operator curve (AUROC) of 0.15, compared to a baseline model. 
When experimentally validated on an independent, prospective, manually selected set of 97 constructs, RP3Net outperformed currently available models, with an AUROC of 0.83, delivering accurate predictions in 77% of the cases, and correctly identifying successfully expressing constructs in 92% of cases.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The model, along with installation and running instructions, is available under an MIT licence at https:\/\/github.com\/RP3Net\/RP3Net, DOI 10.5281\/zenodo.17243498.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btag003","type":"journal-article","created":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T12:50:18Z","timestamp":1767963018000},"source":"Crossref","is-referenced-by-count":0,"title":["RP3Net: a deep learning model for predicting recombinant protein production in\n                    <i>Escherichia coli<\/i>"],"prefix":"10.1093","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8580-9172","authenticated-orcid":false,"given":"Evgeny","family":"Tankhilevich","sequence":"first","affiliation":[{"name":"European Bioinformatics Institute (EMBL-EBI) , Wellcome Genome Campus , Hinxton, Cambridgeshire, CB10 1SD,","place":["United Kingdom"]}]},{"given":"Sergio","family":"Martinez Cuesta","sequence":"additional","affiliation":[{"name":"Astra-Zeneca Data Sciences and Quantitative Biology, Discovery Sciences, BioPharmaceuticals R&D, , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"given":"Ian","family":"Barrett","sequence":"additional","affiliation":[{"name":"Astra-Zeneca Data Sciences and Quantitative Biology, Discovery Sciences, BioPharmaceuticals R&D, , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"given":"Carolina","family":"Berg","sequence":"additional","affiliation":[{"name":"Astra-Zeneca Protein Science, Structure and 
Biophysics, Discovery Sciences, BioPharmaceuticals R&D, , M\u00f6lndal, 431 83,","place":["Sweden"]}]},{"given":"Lovisa","family":"Holmberg Schiavone","sequence":"additional","affiliation":[{"name":"Astra-Zeneca Protein Science, Structure and Biophysics, Discovery Sciences, BioPharmaceuticals R&D, , M\u00f6lndal, 431 83,","place":["Sweden"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8178-0253","authenticated-orcid":false,"given":"Andrew R","family":"Leach","sequence":"additional","affiliation":[{"name":"European Bioinformatics Institute (EMBL-EBI) , Wellcome Genome Campus , Hinxton, Cambridgeshire, CB10 1SD,","place":["United Kingdom"]}]}],"member":"286","published-online":{"date-parts":[[2026,1,11]]},"reference":[{"key":"2026013011071836900_btag003-B1","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/B978-0-12-381274-2.00002-9","article-title":"Preparation of protein samples for NMR structure, function, and small-molecule screening studies","volume":"493","author":"Acton","year":"2011","journal-title":"Methods Enzymol"},{"key":"2026013011071836900_btag003-B2","doi-asserted-by":"crossref","first-page":"1094","DOI":"10.1002\/bab.2600","article-title":"Optimizing recombinant antibody fragment production: a comparison of artificial intelligence and statistical modeling","volume":"71","author":"Basafa","year":"2024","journal-title":"Biotechnol Appl Biochem"},{"key":"2026013011071836900_btag003-B3","doi-asserted-by":"crossref","first-page":"D523","DOI":"10.1093\/nar\/gkac1052","article-title":"UniProt: the universal protein knowledgebase in 2023","volume":"51","author":"Bateman","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B4","doi-asserted-by":"crossref","first-page":"D36","DOI":"10.1093\/nar\/gks1195","article-title":"GenBank","volume":"41","author":"Benson","year":"2013","journal-title":"Nucleic Acids 
Res"},{"key":"2026013011071836900_btag003-B5","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B6","unstructured":"Berman HM, Gabanyi MJ, Kouranov A \u00a0et al \u00a0Protein Structure Initiative\u2014TargetTrack 2000-2017\u2014All Data Files. \u00a02017. 10.5281\/zenodo.821654."},{"key":"2026013011071836900_btag003-B7","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1093\/bioinformatics\/btac020","article-title":"ProteinBERT: a universal deep-learning model of protein sequence and function","volume":"38","author":"Brandes","year":"2022","journal-title":"Bioinformatics"},{"key":"2026013011071836900_btag003-B8","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.pep.2008.01.008","article-title":"Codon optimization can improve expression of human genes in Escherichia coli: a multi-gene study","volume":"59","author":"Burgess-Brown","year":"2008","journal-title":"Protein Expr Purif"},{"key":"2026013011071836900_btag003-B9","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1007\/978-1-0716-0892-0_4","article-title":"Screening and production of recombinant human proteins: protein production in E. 
coli","volume":"2199","author":"Burgess-Brown","year":"2021","journal-title":"Methods Mol Biol"},{"key":"2026013011071836900_btag003-B10","doi-asserted-by":"crossref","first-page":"D488","DOI":"10.1093\/nar\/gkac1077","article-title":"RCSB protein data bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence\/machine learning","volume":"51","author":"Burley","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B11","first-page":"19746","author":"Buterez","year":"2022"},{"key":"2026013011071836900_btag003-B12","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1007\/978-1-0716-0892-0_7","article-title":"High-throughput expression screening in mammalian suspension cells","volume":"2199","author":"Chapple","year":"2021","journal-title":"Methods Mol Biol"},{"key":"2026013011071836900_btag003-B13","first-page":"11","volume-title":"Methods Mol Biol","author":"Cooper","year":"2017"},{"key":"2026013011071836900_btag003-B14","author":"Dallago","year":"2021"},{"key":"2026013011071836900_btag003-B15","doi-asserted-by":"publisher","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","DOI":"10.48550\/arXiv.1810.04805"},{"key":"2026013011071836900_btag003-B16","doi-asserted-by":"crossref","first-page":"5640","DOI":"10.1038\/s41467-024-49777-x","article-title":"A data science roadmap for open science organizations engaged in early-stage drug discovery","volume":"15","author":"Edfeldt","year":"2024","journal-title":"Nat Commun"},{"key":"2026013011071836900_btag003-B17","doi-asserted-by":"crossref","first-page":"634","DOI":"10.1038\/s41570-025-00737-z","article-title":"Protein\u2013ligand data at scale to support machine learning","volume":"9","author":"Edwards","year":"2025","journal-title":"Nat Rev 
Chem"},{"key":"2026013011071836900_btag003-B18","doi-asserted-by":"crossref","first-page":"7112","DOI":"10.1109\/TPAMI.2021.3095381","article-title":"ProtTrans: toward understanding the language of life through self-supervised learning","volume":"44","author":"Elnaggar","year":"2022","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2026013011071836900_btag003-B19","first-page":"5.24.1","article-title":"Strategies to optimize protein expression in E. coli","volume":"5","author":"Francis","year":"2010","journal-title":"Curr Protoc Protein Sci"},{"key":"2026013011071836900_btag003-B20","doi-asserted-by":"publisher","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy function approximation: A gradient boosting machine","volume":"29","author":"Friedman","year":"2001","journal-title":"Ann Stat"},{"key":"2026013011071836900_btag003-B21","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1007\/s10969-011-9106-2","article-title":"The structural biology knowledgebase: a portal to protein structures, sequences, functions, and methods","volume":"12","author":"Gabanyi","year":"2011","journal-title":"J Struct Funct Genomics"},{"key":"2026013011071836900_btag003-B22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.pep.2008.07.005","article-title":"Effective high-throughput overproduction of membrane proteins in Escherichia coli","volume":"62","author":"Gordon","year":"2008","journal-title":"Protein Expr Purif"},{"key":"2026013011071836900_btag003-B23","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1016\/j.pep.2007.11.008","article-title":"The use of systematic N- and C-terminal deletions to promote production and structural studies of recombinant proteins","volume":"58","author":"Gr\u00e4slund","year":"2008","journal-title":"Protein Expr Purif"},{"key":"2026013011071836900_btag003-B24","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/j.pep.2008.10.022","article-title":"Chaperone over-expression 
in Escherichia coli: apparent increased yields of soluble recombinant protein kinases are due mainly to soluble aggregates","volume":"64","author":"Haacke","year":"2009","journal-title":"Protein Expr Purif"},{"key":"2026013011071836900_btag003-B25","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.pep.2008.07.007","article-title":"pCold-GST vector: a novel cold-shock vector containing GST tag for soluble protein production","volume":"62","author":"Hayashi","year":"2008","journal-title":"Protein Expr Purif"},{"key":"2026013011071836900_btag003-B26","doi-asserted-by":"crossref","first-page":"850","DOI":"10.1126\/science.ads0018","article-title":"Simulating 500 million years of evolution with a language model","volume":"387","author":"Hayes","year":"2025","journal-title":"Science"},{"key":"2026013011071836900_btag003-B27","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1093\/bioinformatics\/btaa1102","article-title":"SoluProt: prediction of soluble protein expression in Escherichia coli","volume":"37","author":"Hon","year":"2021","journal-title":"Bioinformatics"},{"key":"2026013011071836900_btag003-B28","doi-asserted-by":"crossref","first-page":"2112","DOI":"10.1093\/bioinformatics\/btab083","article-title":"DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome","volume":"37","author":"Ji","year":"2021","journal-title":"Bioinformatics"},{"key":"2026013011071836900_btag003-B29","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2026013011071836900_btag003-B30","doi-asserted-by":"crossref","first-page":"7475","DOI":"10.1007\/s00253-023-12805-9","article-title":"Development of a thermophilic host\u2013vector system for the production of recombinant proteins at elevated 
temperatures","volume":"107","author":"Kurashiki","year":"2023","journal-title":"Appl Microbiol Biotechnol"},{"key":"2026013011071836900_btag003-B31","first-page":"3744","author":"Lee","year":"2019"},{"key":"2026013011071836900_btag003-B32","first-page":"27351","author":"Li","year":"2024"},{"key":"2026013011071836900_btag003-B33","first-page":"1123","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Lin","year":"2023","journal-title":"Science (1979)"},{"key":"2026013011071836900_btag003-B34","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1007\/978-1-0716-0892-0_6","article-title":"Expression screening of human integral membrane proteins using BacMam","volume":"2199","author":"Mahajan","year":"2021","journal-title":"Methods Mol Biol"},{"key":"2026013011071836900_btag003-B35","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1007\/978-1-0716-0892-0_5","article-title":"Screening and production of recombinant human proteins: protein production in insect cells","volume":"2199","author":"Mahajan","year":"2021","journal-title":"Methods Mol Biol"},{"key":"2026013011071836900_btag003-B36","doi-asserted-by":"crossref","first-page":"i24","DOI":"10.1093\/bioinformatics\/btr229","article-title":"Sequence-based prediction of protein crystallization, purification and production propensity","volume":"27","author":"Mizianty","year":"2011","journal-title":"Bioinformatics"},{"key":"2026013011071836900_btag003-B37","doi-asserted-by":"crossref","first-page":"e0271403","DOI":"10.1371\/journal.pone.0271403","article-title":"A scalable screening of E. 
coli strains for recombinant protein expression","volume":"17","author":"Mor\u00e3o","year":"2022","journal-title":"PLoS One"},{"key":"2026013011071836900_btag003-B38","first-page":"43177","author":"Nguyen","year":"2023"},{"key":"2026013011071836900_btag003-B39","doi-asserted-by":"crossref","first-page":"D1353","DOI":"10.1093\/nar\/gkac1046","article-title":"The next-generation open targets platform: reimagined, redesigned, rebuilt","volume":"51","author":"Ochoa","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B40","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1038\/s42256-024-00791-0","article-title":"Codon language embeddings provide strong signals for use in protein engineering","volume":"6","author":"Outeiral","year":"2024","journal-title":"Nat Mach Intell"},{"key":"2026013011071836900_btag003-B42","first-page":"9689","article-title":"Evaluating protein transfer learning with TAPE","volume":"32","author":"Rao","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026013011071836900_btag003-B43","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1186\/1472-6750-9-37","article-title":"Combined protein construct and synthetic gene engineering for heterologous protein expression and crystallization using gene composer","volume":"9","author":"Raymond","year":"2009","journal-title":"BMC Biotechnol"},{"key":"2026013011071836900_btag003-B44","doi-asserted-by":"crossref","first-page":"D753","DOI":"10.1093\/nar\/gkac1080","article-title":"MGnify: the microbiome sequence data analysis resource in 2023","volume":"51","author":"Richardson","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B45","doi-asserted-by":"crossref","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein 
sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci USA"},{"key":"2026013011071836900_btag003-B46","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.jsb.2010.06.008","article-title":"High-throughput production of human proteins for crystallization: the SGC experience","volume":"172","author":"Savitsky","year":"2010","journal-title":"J Struct Biol"},{"key":"2026013011071836900_btag003-B47","doi-asserted-by":"crossref","first-page":"D134","DOI":"10.1093\/nar\/gkad903","article-title":"GenBank 2024 update","volume":"52","author":"Sayers","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B48","doi-asserted-by":"crossref","first-page":"102572","DOI":"10.1016\/j.xpro.2023.102572","article-title":"A concise guide to choosing suitable gene expression systems for recombinant protein production","volume":"4","author":"Sch\u00fctz","year":"2023","journal-title":"STAR Protoc"},{"key":"2026013011071836900_btag003-B49","doi-asserted-by":"crossref","first-page":"9625","DOI":"10.1038\/s41598-022-13089-1","article-title":"Design of typical genes for heterologous gene expression","volume":"12","author":"Simm","year":"2022","journal-title":"Sci Rep"},{"key":"2026013011071836900_btag003-B50","doi-asserted-by":"crossref","first-page":"1201419","DOI":"10.3389\/fddsv.2023.1201419","article-title":"Drug discovery and development: introduction to the general public and patient groups","volume":"3","author":"Singh","year":"2023","journal-title":"Front Drug Discov"},{"key":"2026013011071836900_btag003-B51","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12934-019-1247-1","article-title":"Aliivibrio wodanis as a production host: development of genetic tools for expression of cold-active enzymes","volume":"18","author":"S\u00f6derberg","year":"2019","journal-title":"Microb Cell 
Fact"},{"key":"2026013011071836900_btag003-B52","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1016\/j.jmb.2014.09.026","article-title":"The CamSol method of rational design of protein mutants with enhanced solubility","volume":"427","author":"Sormanni","year":"2015","journal-title":"J Mol Biol"},{"key":"2026013011071836900_btag003-B53","doi-asserted-by":"crossref","first-page":"8200","DOI":"10.1038\/s41598-017-07800-w","article-title":"Rapid and accurate in silico solubility screening of a monoclonal antibody library","volume":"7","author":"Sormanni","year":"2017","journal-title":"Sci Rep"},{"key":"2026013011071836900_btag003-B54","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/978-1-0716-0892-0_3","article-title":"Screening and production of recombinant human proteins: ligation-independent cloning","volume":"2199","author":"Strain-Damerell","year":"2021","journal-title":"Methods Mol Biol"},{"key":"2026013011071836900_btag003-B55","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1038\/nmeth.f.202","article-title":"Protein production and purification","volume":"5","author":"Structural Genomics Consortium, China Structural Genomics Consortium, Northeast Structural Genomics Consortium","year":"2008","journal-title":"Nat Methods"},{"key":"2026013011071836900_btag003-B56","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/0167-7799(85)90068-X","article-title":"Purification of proteins by IMAC","volume":"3","author":"Sulkowski","year":"1985","journal-title":"Trends Biotechnol"},{"key":"2026013011071836900_btag003-B57","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity 
searches","volume":"31","author":"Suzek","year":"2015","journal-title":"Bioinformatics"},{"key":"2026013011071836900_btag003-B58","doi-asserted-by":"publisher","author":"Taraday","year":"2023","DOI":"10.1109\/ICCV51070.2023.01493"},{"key":"2026013011071836900_btag003-B41","article-title":"Protein production data from the SGC","volume-title":"BioStudies","author":"The SGC Consortium","year":"2022"},{"key":"2026013011071836900_btag003-B59","doi-asserted-by":"crossref","first-page":"941","DOI":"10.1093\/bioinformatics\/btab801","article-title":"NetSolP: predicting protein solubility in Escherichia coli using language models","volume":"38","author":"Thumuluri","year":"2022","journal-title":"Bioinformatics"},{"key":"2026013011071836900_btag003-B60","first-page":"5999","article-title":"Attention is all you need","volume":"2017","author":"Vaswani","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026013011071836900_btag003-B61","doi-asserted-by":"crossref","first-page":"21383","DOI":"10.1038\/srep21383","article-title":"Crysalis: an integrated server for computational analysis and design of protein crystallization","volume":"6","author":"Wang","year":"2016","journal-title":"Sci Rep"},{"key":"2026013011071836900_btag003-B62","doi-asserted-by":"crossref","first-page":"e105902","DOI":"10.1371\/journal.pone.0105902","article-title":"PredPPCrys: accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection","volume":"9","author":"Wang","year":"2014","journal-title":"PLoS One"},{"key":"2026013011071836900_btag003-B63","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbac352","article-title":"SADeepcry: a deep learning framework for protein crystallization propensity prediction using self-attention and auto-encoder networks","volume":"23","author":"Wang","year":"2022","journal-title":"Brief 
Bioinform"},{"key":"2026013011071836900_btag003-B64","doi-asserted-by":"crossref","first-page":"160018","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR guiding principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci Data"},{"key":"2026013011071836900_btag003-B65","doi-asserted-by":"crossref","first-page":"802","DOI":"10.1007\/s12539-024-00639-6","article-title":"PLMC: language model of protein sequences enhances protein crystallization prediction","volume":"16","author":"Xiong","year":"2024","journal-title":"Interdiscip Sci"},{"key":"2026013011071836900_btag003-B66","doi-asserted-by":"crossref","first-page":"W248","DOI":"10.1093\/nar\/gkae381","article-title":"GPSFun: geometry-aware protein sequence function predictions with language models","volume":"52","author":"Yuan","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B67","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-030-57814-5","volume-title":"The Science and Business of Drug Discovery: Demystifying the Jargon","author":"Zanders","year":"2020","edition":"2nd"},{"key":"2026013011071836900_btag003-B68","doi-asserted-by":"crossref","first-page":"D1180","DOI":"10.1093\/nar\/gkad1004","article-title":"The ChEMBL database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods","volume":"52","author":"Zdrazil","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2026013011071836900_btag003-B69","doi-asserted-by":"crossref","first-page":"856049","DOI":"10.3389\/fbioe.2022.856049","article-title":"Strategies and considerations for improving recombinant antibody production and quality in Chinese hamster ovary cells","volume":"10","author":"Zhang","year":"2022","journal-title":"Front Bioeng Biotechnol"},{"key":"2026013011071836900_btag003-B70","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/bib\/bbae404","article-title":"PLM_Sol: predicting 
protein solubility by benchmarking multiple protein language models with the updated Escherichia coli protein solubility dataset","volume":"25","author":"Zhang","year":"2024","journal-title":"Brief Bioinform"},{"key":"2026013011071836900_btag003-B71","first-page":"11053","article-title":"Meta label correction for noisy label learning","author":"Zheng","year":"2021"},{"key":"2026013011071836900_btag003-B72","doi-asserted-by":"crossref","first-page":"e0139695","DOI":"10.1371\/journal.pone.0139695","article-title":"Optimizing production of antigens and fabs in the context of generating recombinant antibodies to human proteins","volume":"10","author":"Zhong","year":"2015","journal-title":"PLoS One"},{"key":"2026013011071836900_btag003-B73","author":"Zhou","year":"2023"},{"key":"2026013011071836900_btag003-B74","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/bib\/bbaa076","article-title":"Accurate multistage prediction of protein crystallization propensity using deep-Cascade Forest with sequence-based features","volume":"22","author":"Zhu","year":"2021","journal-title":"Brief 
Bioinform"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag003\/66342286\/btag003.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/1\/btag003\/66342286\/btag003.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/1\/btag003\/66342286\/btag003.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T16:07:30Z","timestamp":1769789250000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btag003\/8419965"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2026,1]]},"references-count":74,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btag003","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,1]]},"published":{"date-parts":[[2026,1]]},"article-number":"btag003"}}