{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T22:54:44Z","timestamp":1774738484204,"version":"3.50.1"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T00:00:00Z","timestamp":1768867200000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Basic Research Program of Jiangsu","award":["BK20241816"],"award-info":[{"award-number":["BK20241816"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["52101023"],"award-info":[{"award-number":["52101023"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Research Development Fund of Xi\u2019an Jiaotong-Liverpool University","award":["RDF-23-01-073"],"award-info":[{"award-number":["RDF-23-01-073"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Short peptides hold significant promise in drug discovery and materials science due to their biocompatibility, multifunctionality, ease of synthesis, etc. 
However, accurately predicting their physicochemical properties, a prerequisite for application development, remains a grand challenge due to the sheer quantity of peptides.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>This study presents an innovative approach integrating uniform design (UD) on the sampling over the whole space with artificial intelligence (AI) on the sampled data to enhance prediction of key physicochemical properties, including aggregation propensity (AP), hydrophilicity (logP), and isoelectric point (pI), within the complete sequence space of tetrapeptides (160\u00a0000 sequences). Using UD, we generate 31 distinct peptide datasets, with a consistent amino acid occupation fraction of 5% at each position, thereby creating unbiased training data without any amino acid preferences for training AI models. This work provides comprehensive datasets on the physicochemical properties of all tetrapeptides, develops robust AI-based predictive models, and quantitatively elucidates the relationships between key physicochemical attributes and self-assembly behaviors of short peptides by Shapley Additive Explanations (SHAP) analysis. By integrating the strategic experimental design (i.e. 
UD), AI modeling, and peptide domain knowledge, our approach facilitates the discovery and optimization of functional peptides, offering new opportunities for peptide-based therapeutic applications.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The complete datasets, source code, and pretrained models are made available at the Github repository (https:\/\/github.com\/JiaqiBenWang\/UD-AI-Peptide) and Zenodo (https:\/\/doi.org\/10.5281\/zenodo.17984124).<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btag036","type":"journal-article","created":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T12:48:48Z","timestamp":1768481328000},"source":"Crossref","is-referenced-by-count":0,"title":["Uniform design-embedded predictions of (tetra-)peptide physicochemical properties"],"prefix":"10.1093","volume":"42","author":[{"given":"Zhihui","family":"Zhu","sequence":"first","affiliation":[{"name":"ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University , Hangzhou, Zhejiang 311215,","place":["China"]}]},{"given":"Huapeng","family":"Liu","sequence":"additional","affiliation":[{"name":"Wisdom Lake Academy of Pharmacy, Xi\u2019an Jiaotong-Liverpool University , Suzhou, Jiangsu 215123,","place":["China"]}]},{"given":"Xuechen","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Chemistry, State Key Laboratory of Synthetic Chemistry, The University of Hong Kong , Pokfulam, Hong Kong SAR 999077,","place":["China"]}]},{"given":"Haojin","family":"Zhou","sequence":"additional","affiliation":[{"name":"Wisdom Lake Academy of Pharmacy, Xi\u2019an Jiaotong-Liverpool University , Suzhou, Jiangsu 
215123,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5045-1497","authenticated-orcid":false,"given":"Jiaqi","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Chemistry, State Key Laboratory of Synthetic Chemistry, The University of Hong Kong , Pokfulam, Hong Kong SAR 999077,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2026,1,19]]},"reference":[{"key":"2026032818393184600_btag036-B1","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/j.softx.2015.06.001","article-title":"GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers","volume":"1\u20132","author":"Abraham","year":"2015","journal-title":"SoftwareX"},{"key":"2026032818393184600_btag036-B2","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/j.addr.2016.08.006","article-title":"Self-assembling peptide-based building blocks in medical applications","volume":"110\u2013111","author":"Acar","year":"2017","journal-title":"Adv Drug Deliv Rev"},{"key":"2026032818393184600_btag036-B3","doi-asserted-by":"crossref","first-page":"927","DOI":"10.1021\/acsnano.9b08209","article-title":"A near-infrared peptide probe with tumor-specific excretion-retarded effect for image-guided surgery of renal cell carcinoma","volume":"14","author":"An","year":"2020","journal-title":"ACS Nano"},{"key":"2026032818393184600_btag036-B4","doi-asserted-by":"crossref","first-page":"1427","DOI":"10.1038\/s41557-022-01055-3","article-title":"Machine learning overcomes human bias in the discovery of self-assembling peptides","volume":"14","author":"Batra","year":"2022","journal-title":"Nat Chem"},{"key":"2026032818393184600_btag036-B5","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1137\/16M1080173","article-title":"Optimization methods for large-scale machine learning","volume":"60","author":"Bottou","year":"2018","journal-title":"SIAM 
Rev"},{"key":"2026032818393184600_btag036-B6","doi-asserted-by":"crossref","first-page":"102160","DOI":"10.1016\/j.nantod.2024.102160","article-title":"Self-assembly of peptides: the acceleration by molecular dynamics simulations and machine learning","volume":"55","author":"Cao","year":"2024","journal-title":"Nano Today"},{"key":"2026032818393184600_btag036-B7","first-page":"785","author":"Chen","year":"2016"},{"key":"2026032818393184600_btag036-B8","doi-asserted-by":"crossref","first-page":"1611","DOI":"10.1038\/s41467-024-45766-2","article-title":"Design of target specific peptide inhibitors using generative deep learning and molecular dynamics simulations","volume":"15","author":"Chen","year":"2024","journal-title":"Nat Commun"},{"key":"2026032818393184600_btag036-B9","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1021\/ct300646g","article-title":"Improved parameters for the martini coarse-grained protein force field","volume":"9","author":"De Jong","year":"2013","journal-title":"J Chem Theory Comput"},{"key":"2026032818393184600_btag036-B10","first-page":"363","article-title":"Experimental design by uniform distribution","volume":"3","author":"Fan","year":"1980","journal-title":"Acta Math Appl Sin"},{"key":"2026032818393184600_btag036-B11","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1080\/00401706.2000.10486045","article-title":"Uniform design: theory and application","volume":"42","author":"Fang","year":"2000","journal-title":"Technometrics"},{"key":"2026032818393184600_btag036-B12","first-page":"131","volume-title":"Handbook of Statistics","author":"Fang","year":"2003"},{"key":"2026032818393184600_btag036-B13","doi-asserted-by":"publisher","author":"Fang","year":"2005","DOI":"10.1201\/9781420034899"},{"key":"2026032818393184600_btag036-B14","doi-asserted-by":"crossref","first-page":"2380","DOI":"10.1021\/jz2010573","article-title":"Virtual screening for dipeptide aggregation: toward predictive tools for peptide 
self-assembly","volume":"2","author":"Frederix","year":"2011","journal-title":"J Phys Chem Lett"},{"key":"2026032818393184600_btag036-B15","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1109\/5254.708428","article-title":"Support vector machines","volume":"13","author":"Hearst","year":"1998","journal-title":"IEEE Intell Syst"},{"key":"2026032818393184600_btag036-B16","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1090\/S0025-5718-98-00894-1","article-title":"A generalized discrepancy and quadrature error bound","volume":"67","author":"Hickernell","year":"1998","journal-title":"Math Comp"},{"key":"2026032818393184600_btag036-B17","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1109\/34.709601","article-title":"The random subspace method for constructing decision forests","volume":"20","author":"Ho","year":"1998","journal-title":"IEEE Trans Pattern Anal Machine Intell"},{"key":"2026032818393184600_btag036-B18","doi-asserted-by":"crossref","first-page":"2509","DOI":"10.1038\/s41467-023-38056-w","article-title":"Machine learning-driven multifunctional peptide engineering for sustained ocular drug delivery","volume":"14","author":"Hsueh","year":"2023","journal-title":"Nat Commun"},{"key":"2026032818393184600_btag036-B19","article-title":"LightGBM: a highly efficient gradient boosting decision tree","volume":"30","author":"Ke","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026032818393184600_btag036-B20","doi-asserted-by":"crossref","first-page":"W285","DOI":"10.1093\/nar\/gkab295","article-title":"IPC 2.0: prediction of isoelectric point and pKa dissociation constants","volume":"49","author":"Kozlowski","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2026032818393184600_btag036-B21","doi-asserted-by":"crossref","first-page":"3532","DOI":"10.1016\/j.biomaterials.2009.03.018","article-title":"Osteoblastic differentiation of human bone marrow stromal cells in self-assembled BMP-2 receptor-binding 
peptide-amphiphiles","volume":"30","author":"Lee","year":"2009","journal-title":"Biomaterials"},{"key":"2026032818393184600_btag036-B22","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1038\/s41570-020-0215-y","article-title":"Biomimetic peptide self-assembly for functional materials","volume":"4","author":"Levin","year":"2020","journal-title":"Nat Rev Chem"},{"key":"2026032818393184600_btag036-B23","doi-asserted-by":"crossref","first-page":"2592","DOI":"10.1002\/anie.201511276","article-title":"Polyoxometalate-driven self-assembly of short peptides into multivalent nanofibers with enhanced antibacterial activity","volume":"55","author":"Li","year":"2016","journal-title":"Angew Chem Int Ed Engl"},{"key":"2026032818393184600_btag036-B24","doi-asserted-by":"crossref","first-page":"bbad409","DOI":"10.1093\/bib\/bbad409","article-title":"Efficient prediction of peptide self-assembly through sequential and graphical encoding","volume":"24","author":"Liu","year":"2023","journal-title":"Brief Bioinform"},{"key":"2026032818393184600_btag036-B25","doi-asserted-by":"crossref","first-page":"4805","DOI":"10.1016\/j.biomaterials.2014.02.047","article-title":"Ultrashort peptide nanofibrous hydrogels for the acceleration of healing of burn wounds","volume":"35","author":"Loo","year":"2014","journal-title":"Biomaterials"},{"key":"2026032818393184600_btag036-B26","article-title":"A unified approach to interpreting model predictions","volume":"30","author":"Lundberg","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026032818393184600_btag036-B27","doi-asserted-by":"crossref","first-page":"7812","DOI":"10.1021\/jp071097f","article-title":"The MARTINI force field: coarse grained model for biomolecular simulations","volume":"111","author":"Marrink","year":"2007","journal-title":"J Phys Chem B"},{"key":"2026032818393184600_btag036-B28","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1016\/j.cossms.2011.08.001","article-title":"Peptide 
self-assembly for crafting functional biological materials","volume":"15","author":"Matson","year":"2011","journal-title":"Curr Opin Solid State Mater Sci"},{"key":"2026032818393184600_btag036-B29","doi-asserted-by":"crossref","first-page":"819","DOI":"10.1021\/ct700324x","article-title":"The MARTINI coarse-grained force field: extension to proteins","volume":"4","author":"Monticelli","year":"2008","journal-title":"J Chem Theory Comput"},{"key":"2026032818393184600_btag036-B30","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1038\/s42256-024-00928-1","article-title":"Reshaping the discovery of self-assembling peptides with generative AI guided by hybrid deep learning","volume":"6","author":"Njirjak","year":"2024","journal-title":"Nat Mach Intell"},{"key":"2026032818393184600_btag036-B31","author":"Osorio","year":"2015"},{"key":"2026032818393184600_btag036-B32","author":"Powers","year":"2020"},{"key":"2026032818393184600_btag036-B33","article-title":"CatBoost: unbiased boosting with categorical features","volume":"31","author":"Prokhorenkova","year":"2018","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026032818393184600_btag036-B34","author":"Righetti","year":"2000"},{"key":"2026032818393184600_btag036-B35","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1007\/s10115-013-0679-x","article-title":"Explaining prediction models and individual predictions with feature contributions","volume":"41","author":"\u0160trumbelj","year":"2014","journal-title":"Knowl Inf Syst"},{"key":"2026032818393184600_btag036-B36","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026032818393184600_btag036-B37","doi-asserted-by":"crossref","first-page":"8607","DOI":"10.1039\/C7TB01883E","article-title":"Self-assembled RGD dehydropeptide hydrogels for drug delivery applications","volume":"5","author":"Vila\u00e7a","year":"2017","journal-title":"J Mater Chem 
B"},{"key":"2026032818393184600_btag036-B38","doi-asserted-by":"crossref","first-page":"2301544","DOI":"10.1002\/advs.202301544","article-title":"Deep learning empowers the discovery of self-assembling peptides with over 10 trillion sequences","volume":"10","author":"Wang","year":"2023","journal-title":"Adv Sci"},{"key":"2026032818393184600_btag036-B39","doi-asserted-by":"crossref","first-page":"3567","DOI":"10.1021\/jacsau.4c00501","article-title":"Aggregation rules of short peptides","volume":"4","author":"Wang","year":"2024","journal-title":"JACS Au"},{"key":"2026032818393184600_btag036-B40","doi-asserted-by":"crossref","first-page":"791","DOI":"10.1007\/s00778-022-00775-9","article-title":"Data collection and quality challenges in deep learning: a data-centric AI perspective","volume":"32","author":"Whang","year":"2023","journal-title":"VLDB J"},{"key":"2026032818393184600_btag036-B41","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1016\/S0304-4157(98)00021-5","article-title":"Hydrophobic interactions of peptides with membrane interfaces","volume":"1376","author":"White","year":"1998","journal-title":"Biochim Biophys Acta"},{"key":"2026032818393184600_btag036-B42","doi-asserted-by":"crossref","first-page":"842","DOI":"10.1038\/nsb1096-842","article-title":"Experimentally determined hydrophobicity scale for proteins at membrane interfaces","volume":"3","author":"Wimley","year":"1996","journal-title":"Nat Struct Biol"},{"key":"2026032818393184600_btag036-B43","doi-asserted-by":"crossref","first-page":"1857","DOI":"10.1093\/bioinformatics\/btv042","article-title":"protr\/ProtrWeb: R package and web server for generating various numerical representation schemes of protein sequences","volume":"31","author":"Xiao","year":"2015","journal-title":"Bioinformatics"},{"key":"2026032818393184600_btag036-B44","doi-asserted-by":"crossref","first-page":"3880","DOI":"10.1038\/s41467-023-39648-2","article-title":"Accelerating the prediction and discovery of peptide 
hydrogels with human-in-the-loop","volume":"14","author":"Xu","year":"2023","journal-title":"Nat Commun"},{"key":"2026032818393184600_btag036-B45","doi-asserted-by":"crossref","first-page":"e202500163","DOI":"10.1002\/aidi.202500163","article-title":"Sampling strategy: an overlooked factor affecting artificial intelligence prediction accuracy of peptides\u2019 physicochemical properties","author":"Yan","year":"2025","journal-title":"Adv Intell Discov"},{"key":"2026032818393184600_btag036-B46","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1021\/acsabm.0c00707","article-title":"Self-assembled peptide drug delivery systems","volume":"4","author":"Yang","year":"2021","journal-title":"ACS Appl Bio Mater"},{"key":"2026032818393184600_btag036-B47","article-title":"UniDOE: uniform design of experiments","author":"Zhang","journal-title":"R Package Version 1.0.2. Vienna, Austria: The Comprehensive R"},{"key":"2026032818393184600_btag036-B48","doi-asserted-by":"crossref","first-page":"101645","DOI":"10.1016\/j.cocis.2022.101645","article-title":"Computational approaches for understanding and predicting the self-assembled peptide hydrogels","volume":"62","author":"Zhou","year":"2022","journal-title":"Curr Opin Colloid Interface Sci"},{"key":"2026032818393184600_btag036-B49","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1016\/j.jco.2012.11.006","article-title":"Mixture discrepancy for quasi-random point sets","volume":"29","author":"Zhou","year":"2013","journal-title":"J 
Complex"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag036\/66464658\/btag036.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btag036\/66464658\/btag036.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btag036\/66464658\/btag036.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T22:39:40Z","timestamp":1774737580000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btag036\/8430293"}},"subtitle":[],"editor":[{"given":"Jianlin","family":"Cheng","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2026,1,19]]},"references-count":49,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,2,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btag036","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,3]]},"published":{"date-parts":[[2026,1,19]]},"article-number":"btag036"}}