{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T13:59:56Z","timestamp":1775656796390,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2024,11,9]],"date-time":"2024-11-09T00:00:00Z","timestamp":1731110400000},"content-version":"vor","delay-in-days":8,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000266","name":"UK Engineering and Physical Sciences Research Council","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Summary<\/jats:title>\n                    <jats:p>A key challenge in antibody drug discovery is designing novel sequences that are free from developability issues\u2014such as aggregation, polyspecificity, poor expression, or low solubility. Here, we present p-IgGen, a protein language model for paired heavy-light chain antibody generation. The model generates diverse, antibody-like sequences with pairing properties found in natural antibodies. We also create a finetuned version of p-IgGen that biases the model to generate antibodies with 3D biophysical properties that fall within distributions seen in clinical-stage therapeutic antibodies.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The model and inference code are freely available at www.github.com\/oxpig\/p-IgGen. Cleaned training data are deposited at doi.org\/10.5281\/zenodo.13880874.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae659","type":"journal-article","created":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T07:24:16Z","timestamp":1731050656000},"source":"Crossref","is-referenced-by-count":19,"title":["p-IgGen: a paired antibody generative language model"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0239-1207","authenticated-orcid":false,"given":"Oliver M","family":"Turnbull","sequence":"first","affiliation":[{"name":"Department of Statistics, University of Oxford , Oxford, OX1 3LB,","place":["United Kingdom"]}]},{"given":"Dino","family":"Oglic","sequence":"additional","affiliation":[{"name":"Centre for AI, Biopharmaceuticals R&D, AstraZeneca , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"given":"Rebecca","family":"Croasdale-Wood","sequence":"additional","affiliation":[{"name":"Biologics Engineering, Oncology R&D, AstraZeneca , Cambridge, CB2 0AA,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1388-2252","authenticated-orcid":false,"given":"Charlotte M","family":"Deane","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Oxford , Oxford, OX1 3LB,","place":["United Kingdom"]}]}],"member":"286","published-online":{"date-parts":[[2024,11,9]]},"reference":[{"key":"2024112005460935700_btae659-B1","doi-asserted-by":"publisher","first-page":"575","DOI":"10.1038\/s42003-023-04927-7","article-title":"ImmuneBuilder: deep-learning models for predicting the structures of immune proteins","volume":"6","author":"Abanades","year":"2023","journal-title":"Commun Biol"},{"key":"2024112005460935700_btae659-B2","author":"Brown","year":"2020"},{"key":"2024112005460935700_btae659-B3","author":"Chinery","year":"2024"},{"key":"2024112005460935700_btae659-B4","doi-asserted-by":"publisher","first-page":"55","DOI":"10.3390\/antib8040055","article-title":"Antibody structure and function: the basis for engineering therapeutics","volume":"8","author":"Chiu","year":"2019","journal-title":"Antibodies"},{"key":"2024112005460935700_btae659-B5","author":"Chungyoun","year":"2024"},{"key":"2024112005460935700_btae659-B6","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1126\/science.add2187","article-title":"Robust deep learning\u2013based protein sequence design using ProteinMPNN","volume":"378","author":"Dauparas","year":"2022","journal-title":"Science"},{"key":"2024112005460935700_btae659-B7","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1093\/bioinformatics\/btv552","article-title":"ANARCI: antigen receptor numbering and receptor classification","volume":"32","author":"Dunbar","year":"2015","journal-title":"Bioinformatics"},{"key":"2024112005460935700_btae659-B8","doi-asserted-by":"publisher","first-page":"4348","DOI":"10.1038\/s41467-022-32007-7","article-title":"ProtGPT2 is a deep unsupervised language model for protein design","volume":"13","author":"Ferruz","year":"2022","journal-title":"Nat Commun"},{"key":"2024112005460935700_btae659-B9","author":"Hayes","year":"2024"},{"key":"2024112005460935700_btae659-B10","doi-asserted-by":"publisher","first-page":"275","DOI":"10.1038\/s41587-023-01763-2","article-title":"Efficient evolution of human antibodies from general protein language models","volume":"42","author":"Hie","year":"2023","journal-title":"Nat Biotechnol"},{"key":"2024112005460935700_btae659-B11","author":"Hsu","year":"2022"},{"key":"2024112005460935700_btae659-B12","doi-asserted-by":"publisher","first-page":"944","DOI":"10.1073\/pnas.1616408114","article-title":"Biophysical properties of the clinical-stage antibody landscape","volume":"114","author":"Jain","year":"2017","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024112005460935700_btae659-B13","doi-asserted-by":"publisher","first-page":"E486","DOI":"10.1073\/pnas.1613231114","article-title":"Mutational landscape of antibody variable domains reveals a switch modulating the interdomain conformational dynamics and antigen binding","volume":"114","author":"Koenig","year":"2017","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024112005460935700_btae659-B14","doi-asserted-by":"publisher","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Lin","year":"2023","journal-title":"Science"},{"key":"2024112005460935700_btae659-B15","doi-asserted-by":"publisher","first-page":"4041","DOI":"10.1093\/bioinformatics\/btab434","article-title":"Humanization of antibodies using a machine learning approach on large-scale repertoire data","volume":"37","author":"Marks","year":"2021","journal-title":"Bioinformatics"},{"key":"2024112005460935700_btae659-B16","author":"Meier","year":"2021"},{"key":"2024112005460935700_btae659-B17","doi-asserted-by":"publisher","DOI":"10.1016\/j.cels.2023.10.002","volume-title":"Cell Syst","author":"Nijkamp","year":"2023"},{"key":"2024112005460935700_btae659-B18","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1002\/pro.4205","article-title":"Observed antibody space: a diverse database of cleaned, annotated, and translated unpaired and paired antibody sequences","volume":"31","author":"Olsen","year":"2022","journal-title":"Protein Sci"},{"key":"2024112005460935700_btae659-B19","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btae618","volume-title":"Bioinformatics","author":"Olsen","year":"2024"},{"key":"2024112005460935700_btae659-B20","doi-asserted-by":"publisher","first-page":"4025","DOI":"10.1073\/pnas.1810576116","article-title":"Five computational developability guidelines for therapeutic antibody profiling","volume":"116","author":"Raybould","year":"2019","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024112005460935700_btae659-B21","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1038\/s42003-023-05744-8","article-title":"Contextualising the developability risk of antibodies with lambda light chains using enhanced therapeutic antibody profiling","volume":"7","author":"Raybould","year":"2024","journal-title":"Commun Biol"},{"key":"2024112005460935700_btae659-B22","author":"Ruffolo","year":"2021"},{"key":"2024112005460935700_btae659-B23","doi-asserted-by":"publisher","first-page":"2403","DOI":"10.1038\/s41467-021-22732-w","article-title":"Protein design and variant prediction using autoregressive generative models","volume":"12","author":"Shin","year":"2021","journal-title":"Nat Commun"},{"key":"2024112005460935700_btae659-B24","doi-asserted-by":"publisher","first-page":"979","DOI":"10.1016\/j.cels.2023.10.001","article-title":"IgLM: infilling language modeling for antibody sequence design","volume":"14","author":"Shuai","year":"2023","journal-title":"Cell Syst"},{"key":"2024112005460935700_btae659-B25","doi-asserted-by":"publisher","first-page":"127063","DOI":"10.1016\/j.neucom.2023.127063","article-title":"RoFormer: enhanced transformer with rotary position embedding","volume":"568","author":"Su","year":"2024","journal-title":"Neurocomputing"},{"key":"2024112005460935700_btae659-B26","doi-asserted-by":"publisher","first-page":"2213793","DOI":"10.1080\/19420862.2023.2213793","article-title":"Evolution of phage display libraries for therapeutic antibody discovery","volume":"15","author":"Zhang","year":"2023","journal-title":"MAbs"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae659\/60578042\/btae659.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae659\/60752212\/btae659.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae659\/60752212\/btae659.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,20]],"date-time":"2024-11-20T00:46:22Z","timestamp":1732063582000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae659\/7888884"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,11,1]]},"references-count":26,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2024,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae659","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.08.06.606780","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,11,1]]},"article-number":"btae659"}}