{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,15]],"date-time":"2025-08-15T02:46:44Z","timestamp":1755226004291,"version":"3.43.0"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2025,7,29]],"date-time":"2025-07-29T00:00:00Z","timestamp":1753747200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Japan Society for the Promotion of Science","doi-asserted-by":"publisher","award":["JP24H00737","JP22H04925"],"award-info":[{"award-number":["JP24H00737","JP22H04925"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,8,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>RNA plays a crucial role in cellular functions, and designing functional RNA sequences is essential for both scientific exploration and bioengineering applications. Conventional RNA design approaches typically assume a shared secondary structure among designed sequences. However, even closely related RNAs can adopt different secondary structures, particularly when artificial mutations are introduced.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present a novel deep generative model that integrates context-free grammar (CFG) with a variational autoencoder (VAE) to generate RNA sequences while explicitly considering their individual secondary structures. In our method, RNA sequences and their structures are represented as parse trees based on CFG, which are then transformed into binary matrices for VAE training. The optimal parse tree is reconstructed using dynamic programming, ensuring structure-aware sequence generation. When evaluated on natural RNAs from the Rfam database, our model successfully generates high-quality RNA sequences. Furthermore, when applied to RNA aptazyme mutants with distinct secondary structures, our method reveals a strong correlation between the latent space representation of the VAE and self-cleaving activity. This underscores the importance of incorporating RNA-specific structural information in generative models.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/gterai\/RNAgg (archived at Zenodo: https:\/\/doi.org\/10.5281\/zenodo.15354990).<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf427","type":"journal-article","created":{"date-parts":[[2025,7,29]],"date-time":"2025-07-29T16:33:11Z","timestamp":1753806791000},"source":"Crossref","is-referenced-by-count":0,"title":["Deep generative model of RNAs based on variational autoencoder with context-free grammar"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1059-2519","authenticated-orcid":false,"given":"Goro","family":"Terai","sequence":"first","affiliation":[{"name":"Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo , Chiba 277-8561,","place":["Japan"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0909-4982","authenticated-orcid":false,"given":"Kiyoshi","family":"Asai","sequence":"additional","affiliation":[{"name":"Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo , Chiba 277-8561,","place":["Japan"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2025,7,29]]},"reference":[{"key":"2025081218212335900_btaf427-B1","first-page":"350","article-title":"Design of RNAs: comparing programs for inverse RNA folding","volume":"19","author":"Churkin","year":"2018","journal-title":"Brief Bioinform"},{"key":"2025081218212335900_btaf427-B2","doi-asserted-by":"crossref","first-page":"597","DOI":"10.1146\/annurev.biochem.69.1.597","article-title":"Ribozyme structures and mechanisms","volume":"69","author":"Doherty","year":"2000","journal-title":"Annu Rev Biochem"},{"key":"2025081218212335900_btaf427-B3","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1186\/1471-2105-5-71","article-title":"Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction","volume":"5","author":"Dowell","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2025081218212335900_btaf427-B4","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","article-title":"Automatic chemical design using a data-driven continuous representation of molecules","volume":"4","author":"G\u00f3mez-Bombarelli","year":"2018","journal-title":"ACS Cent Sci"},{"key":"2025081218212335900_btaf427-B5","doi-asserted-by":"crossref","first-page":"D192","DOI":"10.1093\/nar\/gkaa1047","article-title":"Rfam 14: expanded coverage of metagenomic, viral and microRNA families","volume":"49","author":"Kalvari","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2025081218212335900_btaf427-B6","doi-asserted-by":"crossref","first-page":"3377","DOI":"10.1093\/bioinformatics\/btv372","article-title":"Forna (force-directed RNA): simple and effective online RNA secondary structure diagrams","volume":"31","author":"Kerpedjiev","year":"2015","journal-title":"Bioinformatics"},{"key":"2025081218212335900_btaf427-B7","doi-asserted-by":"crossref","first-page":"10354","DOI":"10.1002\/anie.201605470","article-title":"High-throughput mutational analysis of a twister ribozyme","volume":"55","author":"Kobori","year":"2016","journal-title":"Angew Chem Int Ed Engl"},{"key":"2025081218212335900_btaf427-B8","doi-asserted-by":"crossref","first-page":"1283","DOI":"10.1021\/acssynbio.7b00057","article-title":"Deep sequencing analysis of aptazyme variants based on a pistol ribozyme","volume":"6","author":"Kobori","year":"2017","journal-title":"ACS Synth Biol"},{"key":"2025081218212335900_btaf427-B9","doi-asserted-by":"crossref","first-page":"837","DOI":"10.1126\/science.aae0568","article-title":"The fitness landscape of a tRNA gene","volume":"352","author":"Li","year":"2016","journal-title":"Science"},{"key":"2025081218212335900_btaf427-B10","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1002\/wrna.1201","article-title":"Finding the target sites of RNA-binding proteins","volume":"5","author":"Li","year":"2014","journal-title":"Wiley Interdiscip Rev RNA"},{"key":"2025081218212335900_btaf427-B11","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1093\/biomet\/37.1-2.17","article-title":"Notes on continuous stochastic phenomena","volume":"37","author":"Moran","year":"1950","journal-title":"Biometrika"},{"key":"2025081218212335900_btaf427-B12","doi-asserted-by":"crossref","first-page":"2933","DOI":"10.1093\/bioinformatics\/btt509","article-title":"Infernal 1.1: 100-fold faster RNA homology searches","volume":"29","author":"Nawrocki","year":"2013","journal-title":"Bioinformatics"},{"key":"2025081218212335900_btaf427-B13","doi-asserted-by":"crossref","first-page":"S111","DOI":"10.1134\/S0006297918140109","article-title":"Structural aspects of ribosomal RNA recognition by ribosomal proteins","volume":"83","author":"Nikulin","year":"2018","journal-title":"Biochemistry (Mosc)"},{"key":"2025081218212335900_btaf427-B14","doi-asserted-by":"crossref","first-page":"1452","DOI":"10.1093\/nar\/gkl1172","article-title":"The structure and function of small nucleolar ribonucleoproteins","volume":"35","author":"Reichow","year":"2007","journal-title":"Nucleic Acids Res"},{"year":"2019","author":"Runge","key":"2025081218212335900_btaf427-B15"},{"key":"2025081218212335900_btaf427-B16","doi-asserted-by":"crossref","first-page":"941","DOI":"10.1038\/s41467-021-21194-4","article-title":"RNA secondary structure prediction using deep learning with thermodynamic integration","volume":"12","author":"Sato","year":"2021","journal-title":"Nat Commun"},{"key":"2025081218212335900_btaf427-B17","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.cell.2012.12.024","article-title":"A decade of riboswitches","volume":"152","author":"Serganov","year":"2013","journal-title":"Cell"},{"key":"2025081218212335900_btaf427-B18","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/s41592-023-02148-8","article-title":"Deep generative design of rna family sequences","volume":"21","author":"Sumi","year":"2024","journal-title":"Nat Methods"},{"year":"2024","author":"Tan","key":"2025081218212335900_btaf427-B19"},{"key":"2025081218212335900_btaf427-B20","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1038\/s41586-023-06415-8","article-title":"De novo design of protein structure and function with RFdiffusion","volume":"620","author":"Watson","year":"2023","journal-title":"Nature"},{"key":"2025081218212335900_btaf427-B21","doi-asserted-by":"publisher","first-page":"829","DOI":"10.1038\/s43588-024-00720-6","article-title":"Deep generative design of RNA aptamers using structural predictions","volume":"4","author":"Wong","year":"2024","journal-title":"Nat Comput Sci"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf427\/63880964\/btaf427.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/8\/btaf427\/63880964\/btaf427.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/8\/btaf427\/63880964\/btaf427.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,12]],"date-time":"2025-08-12T22:21:29Z","timestamp":1755037289000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf427\/8217269"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2025,7,29]]},"references-count":21,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,8,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf427","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,8]]},"published":{"date-parts":[[2025,7,29]]},"article-number":"btaf427"}}