{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T22:06:06Z","timestamp":1768860366005,"version":"3.49.0"},"reference-count":42,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T00:00:00Z","timestamp":1768780800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:sec>\n                    <jats:title>Introduction<\/jats:title>\n                    <jats:p>Translation initiation and termination are critical regulatory checkpoints in protein synthesis, yet accurate computational prediction of their sites remains challenging due to training data biases and the complexity of full-length transcripts.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>To address these limitations, we present TRANSAID (TRANSlation AI for Detection), a novel deep learning framework that accurately and simultaneously predicts translation initiation (TIS) and termination (TTS) sites from complete transcript sequences. TRANSAID\u2019s hierarchical architecture efficiently processes long transcripts, capturing both local motifs and long-range dependencies. Crucially, the model was trained on a human transcriptome dataset that was rigorously partitioned at the gene level to prevent data leakage and included both protein-coding (NM) and non-coding (NR) transcripts.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>This mixed-training strategy enables TRANSAID to achieve high fidelity, correctly identifying 73.61% of NR transcripts as non-coding. Performance is further enhanced by an integrated biological scoring system, improving \u201cperfect ORF prediction\u201d for coding sequences to 94.94% and \u201ccorrect non-coding prediction\u201d to 82.00%. The human-trained model demonstrates remarkable cross-species applicability, maintaining high accuracy on organisms from mammals to yeast. Beyond annotation, TRANSAID serves as a powerful discovery tool for novel coding events. When applied to long-read sequencing data, it accurately identified previously unannotated protein isoforms validated by mass spectrometry (76.28% validation rate). Furthermore, homology searches of high-scoring ORFs predicted within NR transcripts suggest a strong potential for identifying cryptic translation events.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion<\/jats:title>\n                    <jats:p>As a fully documented open-source tool with a user-friendly web server, TRANSAID provides a powerful and accessible resource for improving transcriptome annotation and proteomic discovery.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.3389\/fbinf.2025.1676149","type":"journal-article","created":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T08:56:47Z","timestamp":1768813007000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["TRANSAID: a hybrid deep learning framework for translation site prediction with integrated biological feature scoring"],"prefix":"10.3389","volume":"5","author":[{"given":"Yan","family":"Li","sequence":"first","affiliation":[{"name":"Department of Breast Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College","place":["Beijing, China"]},{"name":"Department of International Medical Service (Xidan Campus), Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College","place":["Beijing, China"]},{"name":"Breast Disease Diagnosis and Treatment Center, Affiliated Hospital of Qinghai University, Affiliated Cancer Hospital of Qinghai University","place":["Xining, China"]}]},{"given":"Boran","family":"Wang","sequence":"additional","affiliation":[{"name":"Beijing Tiantan Hospital, Capital Medical University","place":["Beijing, China"]}]},{"given":"Zhen","family":"Liu","sequence":"additional","affiliation":[{"name":"Breast Disease Diagnosis and Treatment Center, Affiliated Hospital of Qinghai University, Affiliated Cancer Hospital of Qinghai University","place":["Xining, China"]}]},{"given":"Wei","family":"Wei","sequence":"additional","affiliation":[{"name":"Beijing Friendship Hospital, Capital Medical University","place":["Beijing, China"]}]},{"given":"Caiyi","family":"Fei","sequence":"additional","affiliation":[{"name":"Department of AI and Bioinformatics, Nanjing Chengshi Biopharmaceutical (TheraRNA) Co., Ltd.","place":["Nanjing, China"]}]},{"given":"Shi","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of AI and Bioinformatics, Nanjing Chengshi Biopharmaceutical (TheraRNA) Co., Ltd.","place":["Nanjing, China"]}]},{"given":"Tiyun","family":"Han","sequence":"additional","affiliation":[{"name":"Department of AI and Bioinformatics, Nanjing Chengshi Biopharmaceutical (TheraRNA) Co., Ltd.","place":["Nanjing, China"]}]},{"given":"Wei","family":"Geng","sequence":"additional","affiliation":[{"name":"Department of AI and Bioinformatics, Nanjing Chengshi Biopharmaceutical (TheraRNA) Co., Ltd.","place":["Nanjing, China"]}]},{"given":"Zengding","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of AI and Bioinformatics, Nanjing Chengshi Biopharmaceutical (TheraRNA) Co., Ltd.","place":["Nanjing, China"]}]}],"member":"1965","published-online":{"date-parts":[[2026,1,19]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"4139","DOI":"10.1016\/j.febslet.2012.10.010","article-title":"The 5\u2019 untranslated region of Apaf-1 mRNA directs translation under apoptosis conditions via a 5\u2019 end-dependent scanning mechanism","volume":"586","author":"Andreev","year":"2012","journal-title":"FEBS Lett."},{"key":"B2","doi-asserted-by":"publisher","first-page":"e0141287","DOI":"10.1371\/journal.pone.0141287","article-title":"Continuous distributed representation of biological sequences for deep proteomics and genomics","volume":"10","author":"Asgari","year":"2015","journal-title":"PLOS ONE"},{"key":"B3","doi-asserted-by":"publisher","first-page":"1140","DOI":"10.1126\/science.aay0262","article-title":"Pervasive functional translation of noncanonical human open reading frames","volume":"367","author":"Chen","year":"2020","journal-title":"Science"},{"key":"B4","doi-asserted-by":"publisher","first-page":"gkaf277","DOI":"10.1093\/nar\/gkaf277","article-title":"Analysis of RNA translation with a deep learning architecture provides new insight into translation control","volume":"53","author":"Fan","year":"2025","journal-title":"Nucleic Acids Res."},{"key":"B5","doi-asserted-by":"publisher","first-page":"e1498","DOI":"10.1002\/wrna.1498","article-title":"Unconventional RNA\u2010binding proteins step into the virus\u2013host battlefront","volume":"9","author":"Garcia\u2010Moreno","year":"2018","journal-title":"Wiley Interdiscip. Rev. RNA"},{"key":"B6","doi-asserted-by":"publisher","first-page":"e108475","DOI":"10.1371\/journal.pone.0108475","article-title":"Natural variability of Kozak sequences correlates with function in a zebrafish model","volume":"9","author":"Grzegorski","year":"2014","journal-title":"PLOS ONE"},{"key":"B7","doi-asserted-by":"publisher","first-page":"1413","DOI":"10.1126\/science.aad9868","article-title":"Translational control by 5\u2032-untranslated regions of eukaryotic mRNAs","volume":"352","author":"Hinnebusch","year":"2016","journal-title":"Science"},{"key":"B8","doi-asserted-by":"publisher","first-page":"218","DOI":"10.1126\/science.1168978","article-title":"Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling","volume":"324","author":"Ingolia","year":"2009","journal-title":"Science"},{"key":"B9","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1038\/nrm2838","article-title":"The mechanism of eukaryotic translation initiation and principles of its regulation","volume":"11","author":"Jackson","year":"2010","journal-title":"Nat. Rev. Mol. Cell Biol."},{"key":"B10","doi-asserted-by":"publisher","first-page":"eado5600","DOI":"10.1126\/sciadv.ado5600","article-title":"Large-scale transcript variants dictate neoepitopes for cancer immunotherapy","volume":"11","author":"Ji","year":"2025","journal-title":"Sci. Adv."},{"key":"B11","doi-asserted-by":"publisher","first-page":"1247","DOI":"10.1016\/j.ygeno.2020.11.011","article-title":"Targeting translation regulators improves cancer therapy","volume":"113","author":"Jiang","year":"2021","journal-title":"Genomics"},{"key":"B12","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1016\/j.ccell.2018.07.001","article-title":"Comprehensive analysis of alternative splicing across tumors from 8,705 patients","volume":"34","author":"Kahles","year":"2018","journal-title":"Cancer Cell"},{"key":"B13","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1016\/0092-8674(86)90762-2","article-title":"Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes","volume":"44","author":"Kozak","year":"1986","journal-title":"Cell"},{"key":"B14","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1016\/j.gene.2005.06.037","article-title":"Regulation of translation via mRNA structure in prokaryotes and eukaryotes","volume":"361","author":"Kozak","year":"2005","journal-title":"Gene"},{"key":"B15","doi-asserted-by":"publisher","first-page":"eade2886","DOI":"10.1126\/scitranslmed.ade2886","article-title":"Splicing neoantigen discovery with SNAF reveals shared targets for cancer immunotherapy","volume":"16","author":"Li","year":"2024","journal-title":"Sci. Transl. Med."},{"key":"B16","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1186\/s12859-022-05037-7","article-title":"Deciphering the role of RNA structure in translation efficiency","volume":"23","author":"Lin","year":"2022","journal-title":"BMC Bioinforma."},{"key":"B17","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1142\/S0219720003000216","article-title":"Data mining tools for biological sequences","volume":"1","author":"Liu","year":"2003","journal-title":"J. Bioinform Comput. Biol."},{"key":"B18","doi-asserted-by":"publisher","first-page":"a033092","DOI":"10.1101\/cshperspect.a033092","article-title":"Protein synthesis initiation in eukaryotic cells","volume":"10","author":"Merrick","year":"2018","journal-title":"Cold Spring Harb. Perspect. Biol."},{"key":"B19","article-title":"Efficient estimation of word representations in vector space","author":"Mikolov","year":"2013"},{"key":"B20","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1186\/s13059-022-02624-y","article-title":"Enhanced protein isoform characterization through long-read proteogenomics","volume":"23","author":"Miller","year":"2022","journal-title":"Genome Biol."},{"key":"B21","first-page":"226","article-title":"Neural network prediction of translation initiation sites in eukaryotes: perspectives for EST and genome analysis","volume":"5","author":"Pedersen","year":"1997","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. Biol."},{"key":"B22","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1186\/1471-2148-8-99","article-title":"Both selective and neutral processes drive GC content evolution in the human genome","volume":"8","author":"Pozzoli","year":"2008","journal-title":"BMC Evol. Biol."},{"key":"B23","doi-asserted-by":"publisher","first-page":"e2","DOI":"10.1093\/nargab\/lqz002","article-title":"Conserved regions in long non-coding RNAs contain abundant translation and protein\u2013RNA interaction signatures","volume":"1","author":"Ruiz-Orera","year":"2019","journal-title":"NAR Genomics Bioinforma."},{"key":"B24","doi-asserted-by":"publisher","first-page":"i418","DOI":"10.1093\/bioinformatics\/btm177","article-title":"Translation initiation site prediction on a genomic scale: beauty in simplicity","volume":"23","author":"Saeys","year":"2007","journal-title":"Bioinformatics"},{"key":"B25","doi-asserted-by":"publisher","first-page":"384","DOI":"10.1093\/bioinformatics\/14.5.384","article-title":"Assessing protein coding region integrity in cDNA sequencing projects","volume":"14","author":"Salamov","year":"1998","journal-title":"Bioinformatics"},{"key":"B26","doi-asserted-by":"publisher","first-page":"1281","DOI":"10.1093\/nar\/15.3.1281","article-title":"The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications","volume":"15","author":"Sharp","year":"1987","journal-title":"Nucleic Acids Res."},{"key":"B27","doi-asserted-by":"publisher","first-page":"918","DOI":"10.3389\/fpls.2015.00918","article-title":"Genome-wide survey and comprehensive expression profiling of Aux\/IAA gene family in chickpea and soybean","volume":"6","author":"Singh","year":"2015","journal-title":"Front. Plant Sci."},{"key":"B28","doi-asserted-by":"publisher","first-page":"e1628","DOI":"10.1002\/wrna.1628","article-title":"Translational control in aging and neurodegeneration","volume":"12","author":"Skariah","year":"2021","journal-title":"Wiley Interdiscip. Rev. RNA"},{"key":"B29","doi-asserted-by":"publisher","first-page":"731","DOI":"10.1016\/j.cell.2009.01.042","article-title":"Regulation of translation initiation in eukaryotes: mechanisms and biological targets","volume":"136","author":"Sonenberg","year":"2009","journal-title":"Cell"},{"key":"B30","doi-asserted-by":"publisher","first-page":"214","DOI":"10.1101\/gr.221507.117","article-title":"Conserved non-AUG uORFs revealed by a novel regression analysis of ribosome profiling data","volume":"28","author":"Spealman","year":"2018","journal-title":"Genome Res."},{"key":"B31","doi-asserted-by":"publisher","first-page":"aad3867","DOI":"10.1126\/science.aad3867","article-title":"Translation from the 5\u2032 untranslated region shapes the integrated stress response","volume":"351","author":"Starck","year":"2016","journal-title":"Science"},{"key":"B32","doi-asserted-by":"publisher","first-page":"912","DOI":"10.1101\/gr.5211806","article-title":"Genomic localization of RNA binding proteins reveals links between pre-mRNA processing and transcription","volume":"16","author":"Swinburne","year":"2006","journal-title":"Genome Res."},{"key":"B33","doi-asserted-by":"publisher","first-page":"e78","DOI":"10.1093\/nar\/gkv227","article-title":"Identification of protein coding regions in RNA transcripts","volume":"43","author":"Tang","year":"2015","journal-title":"Nucleic Acids Res."},{"key":"B34","doi-asserted-by":"publisher","first-page":"310","DOI":"10.1186\/s13059-021-02525-6","article-title":"Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing","volume":"22","author":"Tian","year":"2021","journal-title":"Genome Biol."},{"key":"B35","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1038\/nrc.2016.27","article-title":"New frontiers in translational control of the cancer genome","volume":"16","author":"Truitt","year":"2016","journal-title":"Nat. Rev. Cancer"},{"key":"B36","doi-asserted-by":"publisher","first-page":"601","DOI":"10.1038\/nrg.2016.85","article-title":"Evolution to the rescue: using comparative genomics to understand long non-coding RNAs","volume":"17","author":"Ulitsky","year":"2016","journal-title":"Nat. Rev. Genet."},{"key":"B37","doi-asserted-by":"publisher","first-page":"242","DOI":"10.1016\/j.cell.2019.05.010","article-title":"The translational landscape of the human heart","volume":"178","author":"van Heesch","year":"2019","journal-title":"Cell"},{"key":"B38","doi-asserted-by":"publisher","first-page":"2487","DOI":"10.26599\/TST.2024.9010125","article-title":"LLM4DEU: fine tuning large language model for medical diagnosis in outpatient and emergency department visits of neurosurgery","volume":"30","author":"Wang","year":"2025","journal-title":"Tsinghua Sci. Technol."},{"key":"B39","doi-asserted-by":"publisher","first-page":"1155","DOI":"10.1038\/s41587-019-0217-9","article-title":"Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome","volume":"37","author":"Wenger","year":"2019","journal-title":"Nat. Biotechnol."},{"key":"B40","doi-asserted-by":"publisher","first-page":"1297","DOI":"10.1038\/s41592-019-0617-2","article-title":"Nanopore native RNA sequencing of a human poly(A) transcriptome","volume":"16","author":"Workman","year":"2019","journal-title":"Nat. Methods"},{"key":"B41","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1802.00810","article-title":"Deep learning for genomics: a concise overview","author":"Yue","year":"2018","journal-title":"ArXiv"},{"key":"B42","doi-asserted-by":"publisher","first-page":"i234","DOI":"10.1093\/bioinformatics\/btx247","article-title":"TITER: predicting translation initiation sites by deep learning","volume":"33","author":"Zhang","year":"2017","journal-title":"Bioinformatics"}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1676149\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T08:56:48Z","timestamp":1768813008000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1676149\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,19]]},"references-count":42,"alternative-id":["10.3389\/fbinf.2025.1676149"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1676149","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,19]]},"article-number":"1676149"}}