{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T06:54:13Z","timestamp":1769151253685,"version":"3.49.0"},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T00:00:00Z","timestamp":1768089600000},"content-version":"vor","delay-in-days":10,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Institute of Information & Communications Technology Planning & Evaluation (IITP)\u2014ICT Challenge and Advanced Network of HRD","award":["IITP-2025-RS-2022-00156439"],"award-info":[{"award-number":["IITP-2025-RS-2022-00156439"]}]},{"name":"Institute of Information & Communications Technology Planning & Evaluation (IITP)\u2014ICT Challenge and Advanced Network of HRD","award":["IITP-2025-RS-2024-00438263"],"award-info":[{"award-number":["IITP-2025-RS-2024-00438263"]}]},{"name":"Bio & Medical Technology Development Program of the National Research Foundation","award":["RS-2025-02217289"],"award-info":[{"award-number":["RS-2025-02217289"]}]},{"name":"Bio & Medical Technology Development Program of the National Research Foundation","award":["2022M3E5F3081268"],"award-info":[{"award-number":["2022M3E5F3081268"]}]},{"DOI":"10.13039\/501100002642","name":"Korea University","doi-asserted-by":"publisher","award":["K2517281"],"award-info":[{"award-number":["K2517281"]}],"id":[{"id":"10.13039\/501100002642","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,1,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>RNA performs a variety of functions within cells and is implicated in various human diseases. 
Because druggable proteins account for only a small portion of the genome, interest in developing drugs that target RNAs has been growing. Thus, precise prediction of small-molecule binding sites across different classes of RNAs is important. In this study, a lightweight deep learning program for predicting RNA-drug binding sites, called compound binding site prediction for RNA (CoBRA), is introduced. Our approach utilizes residue-level embeddings derived from a pre-trained RNA language model, without relying on any structural information. These embeddings encapsulate the contextual and statistical properties of each nucleotide and are used as input for a multi-layer perceptron classifier that performs binary classification of binding nucleotides. The model was trained using the TR60 and HARIBOSS datasets and tested on four independent benchmark sets. CoBRA achieves a relative improvement of 22.1% in the Matthews correlation coefficient and a 45.6% increase in sensitivity over existing state-of-the-art RNA\u2013ligand binding site prediction methods that utilize structural information. These results demonstrate that sequence-based language model embeddings, which do not require explicit coordinate or distance information, can match or outperform structure-based methods. 
This makes it a flexible tool for predicting binding sites across diverse RNA targets.<\/jats:p>","DOI":"10.1093\/bib\/bbaf713","type":"journal-article","created":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T12:53:24Z","timestamp":1766235204000},"source":"Crossref","is-referenced-by-count":0,"title":["CoBRA: compound binding site prediction using RNA language model"],"prefix":"10.1093","volume":"27","author":[{"given":"Wonkyeong","family":"Jang","sequence":"first","affiliation":[{"name":"Department of Biomedical Informatics, Korea University College of Medicine , 161 Jeongneung-ro, Seongbuk-gu, Seoul 02708 ,","place":["Republic of Korea"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3462-0243","authenticated-orcid":false,"given":"Woong-Hee","family":"Shin","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Korea University College of Medicine , 161 Jeongneung-ro, Seongbuk-gu, Seoul 02708 ,","place":["Republic of Korea"]},{"name":"Arontier, Co. , 241 Gangnam-daero, Seocho-gu, Seoul 06735 ,","place":["Republic of Korea"]}]}],"member":"286","published-online":{"date-parts":[[2026,1,11]]},"reference":[{"key":"2026011104420694600_ref1","doi-asserted-by":"publisher","first-page":"861","DOI":"10.1038\/nrg3074","article-title":"Non-coding RNAs in human disease","volume":"12","author":"Esteller","year":"2011","journal-title":"Nat Rev Genet"},{"key":"2026011104420694600_ref2","doi-asserted-by":"publisher","first-page":"D983","DOI":"10.1093\/nar\/gks1099","article-title":"LncRNADisease: a database for long-non-coding RNA-associated diseases","volume":"41","author":"Chen","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2026011104420694600_ref3","doi-asserted-by":"publisher","first-page":"e3420","DOI":"10.1371\/journal.pone.0003420","article-title":"An analysis of human microRNA and disease associations","volume":"3","author":"Lu","year":"2008","journal-title":"PloS 
One"},{"key":"2026011104420694600_ref4","doi-asserted-by":"publisher","first-page":"1046","DOI":"10.1038\/nmeth.2238","article-title":"The ENCODE project","volume":"9","author":"Souza","year":"2012","journal-title":"Nat Methods"},{"key":"2026011104420694600_ref5","doi-asserted-by":"publisher","first-page":"727","DOI":"10.1038\/nrd892","article-title":"The druggable genome","volume":"1","author":"Hopkins","year":"2002","journal-title":"Nat Rev Drug Discov"},{"key":"2026011104420694600_ref6","doi-asserted-by":"publisher","first-page":"955","DOI":"10.1042\/EBC20200011","article-title":"Targeting RNA structures in diseases with small molecules","volume":"64","author":"Shao","year":"2020","journal-title":"Essays Biochem"},{"key":"2026011104420694600_ref7","doi-asserted-by":"publisher","first-page":"862","DOI":"10.1124\/pr.120.019554","article-title":"RNA drugs and RNA targets for small molecules: principles, progress, and challenges","volume":"72","author":"Yu","year":"2020","journal-title":"Pharmacol Rev"},{"key":"2026011104420694600_ref8","doi-asserted-by":"publisher","first-page":"9179","DOI":"10.1038\/srep09179","article-title":"Rsite: a computational method to identify the functional sites of noncoding RNAs","volume":"5","author":"Zeng","year":"2015","journal-title":"Sci Rep"},{"key":"2026011104420694600_ref9","doi-asserted-by":"publisher","first-page":"19016","DOI":"10.1038\/srep19016","article-title":"Rsite2: an efficient computational method to predict the functional sites of noncoding RNAs","volume":"6","author":"Zeng","year":"2016","journal-title":"Sci Rep"},{"key":"2026011104420694600_ref10","doi-asserted-by":"publisher","first-page":"3131","DOI":"10.1093\/bioinformatics\/bty345","article-title":"RBind: computational network method to predict RNA binding 
sites","volume":"34","author":"Wang","year":"2018","journal-title":"Bioinformatics"},{"key":"2026011104420694600_ref11","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1093\/bioinformatics\/btaa1092","article-title":"Recognition of small molecule\u2013RNA binding sites using RNA sequence and structure","volume":"37","author":"Su","year":"2021","journal-title":"Bioinformatics"},{"key":"2026011104420694600_ref12","doi-asserted-by":"publisher","first-page":"6979","DOI":"10.1021\/acs.jcim.4c01264","article-title":"Predicting small molecule binding nucleotides in RNA structures using RNA surface topography","volume":"64","author":"Gao","year":"2024","journal-title":"J Chem Inf Model"},{"key":"2026011104420694600_ref13","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmb.2025.169010","article-title":"Identifying RNA-small molecule binding sites using geometric deep learning with language models","volume":"437","author":"Zhu","year":"2025","journal-title":"J Mol Biol"},{"key":"2026011104420694600_ref14","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btaf447","article-title":"RNA language model and graph attention network for RNA and small molecule binding sites prediction","volume":"41","author":"Sun","year":"2025","journal-title":"Bioinformatics"},{"key":"2026011104420694600_ref15","doi-asserted-by":"publisher","first-page":"8448","DOI":"10.1021\/acs.jcim.5c00605","article-title":"GATRsite: RNA\u2013ligand binding site prediction using graph attention networks and pretrained RNA language models","volume":"65","author":"Sun","year":"2025","journal-title":"J Chem Inf Model"},{"key":"2026011104420694600_ref16","doi-asserted-by":"publisher","first-page":"bbaf489","DOI":"10.1093\/bib\/bbaf489","article-title":"MVRBind: multi-view learning for RNA-small molecule binding site prediction","volume":"26","author":"Chen","year":"2025","journal-title":"Brief 
Bioinform"},{"key":"2026011104420694600_ref17","doi-asserted-by":"publisher","first-page":"4185","DOI":"10.1093\/bioinformatics\/btac483","article-title":"HARIBOSS: a curated database of RNA-small molecules structures to aid rational drug design","volume":"38","author":"Panei","year":"2022","journal-title":"Bioinformatics"},{"key":"2026011104420694600_ref18","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1912.01703","article-title":"Pytorch: an imperative style, high-performance deep learning library","author":"Paszke","year":"2019","journal-title":"arXiv"},{"key":"2026011104420694600_ref19","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1711.05101","article-title":"Decoupled weight decay regularization","author":"Loshchilov","year":"2019","journal-title":"arXiv"},{"key":"2026011104420694600_ref20","doi-asserted-by":"publisher","DOI":"10.1101\/2024.03.17.585376","article-title":"ERNIE-RNA: an RNA language model with structure-enhanced representations","author":"Yin","year":"2024","journal-title":"bioRxiv"},{"key":"2026011104420694600_ref21","doi-asserted-by":"publisher","first-page":"5671","DOI":"10.1038\/s41467-025-60872-5","article-title":"RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks","volume":"16","author":"Peni\u0107","year":"2025","journal-title":"Nat Commun"},{"key":"2026011104420694600_ref22","doi-asserted-by":"publisher","first-page":"lqac012","DOI":"10.1093\/nargab\/lqac012","article-title":"Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning","volume":"4","author":"Akiyama","year":"2022","journal-title":"NAR Genomics Bioinf"},{"key":"2026011104420694600_ref23","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2204.00300","article-title":"Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function 
predictions","author":"Chen","year":"2022","journal-title":"arXiv"},{"key":"2026011104420694600_ref24","doi-asserted-by":"publisher","first-page":"e3","DOI":"10.1093\/nar\/gkad1031","article-title":"Multiple sequence alignment-based RNA language model and its application to structural inference","volume":"52","author":"Zhang","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2026011104420694600_ref25","doi-asserted-by":"publisher","first-page":"bbae163","DOI":"10.1093\/bib\/bbae163","article-title":"Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction","volume":"25","author":"Chen","year":"2024","journal-title":"Brief Bioinform"},{"key":"2026011104420694600_ref26","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1038\/s42256-024-00823-9","article-title":"A 5\u2032 UTR language model for decoding untranslated regions of mRNA and function predictions","volume":"6","author":"Chu","year":"2024","journal-title":"Nat Mach Intell"},{"key":"2026011104420694600_ref27","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1901.05555","article-title":"Class-balanced loss based on effective number of samples","author":"Cui","year":"2019","journal-title":"arXiv"},{"key":"2026011104420694600_ref28","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1706.05721","article-title":"Tversky loss function for image segmentation using 3D fully convolutional deep networks","author":"Salehi","year":"2017","journal-title":"arXiv"},{"key":"2026011104420694600_ref29","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1606.04797","article-title":"V-net: fully convolutional neural networks for volumetric medical image segmentation","author":"Milletari","year":"2016","journal-title":"arXiv"},{"key":"2026011104420694600_ref30","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1705.08790","article-title":"The lov\u00e1sz-softmax loss: a tractable surrogate for the optimization of the 
intersection-over-union measure in neural networks","author":"Berman","year":"2018","journal-title":"arXiv"},{"key":"2026011104420694600_ref31","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1803.06189","article-title":"Triplet-center loss for multi-view 3d object retrieval","author":"He","year":"2018","journal-title":"arXiv"},{"key":"2026011104420694600_ref32","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1186\/s13321-024-00920-2","article-title":"Protein-small molecule binding site prediction based on a pre-trained protein language model with contrastive learning","volume":"16","author":"Wang","year":"2024","journal-title":"J Chem"},{"key":"2026011104420694600_ref33","doi-asserted-by":"publisher","DOI":"10.1101\/2020.05.05.078014","article-title":"Structures of microRNA-precursor apical junctions and loops reveal non-canonical base pairs important for processing","author":"Shoffner","year":"2020","journal-title":"bioRxiv"},{"key":"2026011104420694600_ref34","doi-asserted-by":"publisher","DOI":"10.1038\/nchembio.1607","article-title":"Structural insights into recognition of c-di-AMP by the ydaO riboswitch","volume":"10","author":"Gao","year":"2014","journal-title":"Nat Chem Biol"},{"key":"2026011104420694600_ref35","doi-asserted-by":"publisher","first-page":"e74","DOI":"10.1093\/nar\/gkaa426","article-title":"DSSR-enabled innovative schematics of 3D nucleic acid structures with PyMOL","volume":"48","author":"Lu","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2026011104420694600_ref36","doi-asserted-by":"publisher","first-page":"1809","DOI":"10.1021\/acs.jcim.1c00972","article-title":"BiRDS\u2014binding residue detection from protein sequences using deep ResNets","volume":"62","author":"Chelur","year":"2022","journal-title":"J Chem Inf Model"},{"key":"2026011104420694600_ref37","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1146\/annurev.biophys.34.040204.144511","article-title":"Ions and RNA 
folding","volume":"34","author":"Draper","year":"2005","journal-title":"Annu Rev Biophys Biomol Struct"},{"key":"2026011104420694600_ref38","doi-asserted-by":"publisher","first-page":"198","DOI":"10.1093\/bioinformatics\/btr636","article-title":"MetalionRNA: computational predictor of metal-binding sites in RNA structures","volume":"28","author":"Philips","year":"2012","journal-title":"Bioinformatics"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/27\/1\/bbaf713\/66342166\/bbaf713.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/27\/1\/bbaf713\/66342166\/bbaf713.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T09:42:16Z","timestamp":1768124536000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf713\/8419945"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1]]},"references-count":38,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1,7]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf713","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,1]]},"published":{"date-parts":[[2026,1]]},"article-number":"bbaf713"}}