{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,29]],"date-time":"2026-03-29T07:44:06Z","timestamp":1774770246846,"version":"3.50.1"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2024,5,24]],"date-time":"2024-05-24T00:00:00Z","timestamp":1716508800000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32122023"],"award-info":[{"award-number":["32122023"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32070603"],"award-info":[{"award-number":["32070603"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National High Level Hospital Clinical Research Funding","award":["2023-PUMCH-E-008"],"award-info":[{"award-number":["2023-PUMCH-E-008"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,5,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>The untranslated region (UTR) of messenger ribonucleic acid (mRNA), including the 5\u2032UTR and 3\u2032UTR, plays a critical role in regulating gene expression and translation. Variants within the UTR can lead to changes associated with human traits and diseases; however, computational prediction of UTR variant effect is challenging. Current noncoding variant prediction mainly focuses on the promoters and enhancers, neglecting the unique sequence of the UTR and thereby limiting their predictive accuracy. In this study, using consolidated datasets of UTR variants from disease databases and large-scale experimental data, we systematically analyzed more than 50 region-specific features of UTR, including functional elements, secondary structure, sequence composition and site conservation. Our analysis reveals that certain features, such as C\/G-related sequence composition in 5\u2032UTR and A\/T-related sequence composition in 3\u2032UTR, effectively differentiate between nonfunctional and functional variant sets, unveiling potential sequence determinants of functional UTR variants. Leveraging these insights, we developed two classification models to predict functional UTR variants using machine learning, achieving an area under the curve (AUC) value of 0.94 for 5\u2032UTR and 0.85 for 3\u2032UTR, outperforming all existing methods. Our models will be valuable for enhancing clinical interpretation of genetic variants, facilitating the prediction and management of disease risk.<\/jats:p>","DOI":"10.1093\/bib\/bbae248","type":"journal-article","created":{"date-parts":[[2024,5,24]],"date-time":"2024-05-24T07:36:29Z","timestamp":1716536189000},"source":"Crossref","is-referenced-by-count":6,"title":["Predicting functional UTR variants by integrating region-specific features"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6789-7129","authenticated-orcid":false,"given":"Guangyu","family":"Li","sequence":"first","affiliation":[{"name":"State Key Laboratory of Common Mechanism Research for Major Diseases; Center for bioinformatics , National Infrastructures for Translational Medicine, , 1 Shuai Fu Yuan, Dongcheng District, Beijing 100005 , China"},{"name":"Institute of Clinical Medicine and Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College , National Infrastructures for Translational Medicine, , 1 Shuai Fu Yuan, Dongcheng District, Beijing 100005 , China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiayu","family":"Wu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Common Mechanism Research for Major Diseases; Center for bioinformatics , National Infrastructures for Translational Medicine, , 1 Shuai Fu Yuan, Dongcheng District, Beijing 100005 , China"},{"name":"Institute of Clinical Medicine and Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College , National Infrastructures for Translational Medicine, , 1 Shuai Fu Yuan, Dongcheng District, Beijing 100005 , China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoyue","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Common Mechanism Research for Major Diseases; Center for bioinformatics , National Infrastructures for Translational Medicine, , 1 Shuai Fu Yuan, Dongcheng District, Beijing 100005 , China"},{"name":"Institute of Clinical Medicine and Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College , National Infrastructures for Translational Medicine, , 1 Shuai Fu Yuan, Dongcheng District, Beijing 100005 , China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,5,23]]},"reference":[{"key":"2024052405593100800_ref1","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/j.trecan.2019.02.011","article-title":"The untranslated regions of mRNAs in cancer","volume":"5","author":"Schuster","year":"2019","journal-title":"Trends Cancer"},{"key":"2024052405593100800_ref2","doi-asserted-by":"crossref","first-page":"896","DOI":"10.1101\/gr.242552.118","article-title":"A massively parallel 3\u2032 UTR reporter assay reveals relationships between nucleotide content, sequence conservation, and mRNA destabilization","volume":"29","author":"Litterman","year":"2019","journal-title":"Genome Res"},{"key":"2024052405593100800_ref3","doi-asserted-by":"crossref","first-page":"1617","DOI":"10.1101\/gr.211401","article-title":"Evolutionarily conserved noncoding DNA in the human genome: how much and what for?","volume":"11","author":"Meisler","year":"2001","journal-title":"Genome Res"},{"key":"2024052405593100800_ref4","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1101\/gr.3715005","article-title":"Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes","volume":"15","author":"Siepel","year":"2005","journal-title":"Genome Res"},{"key":"2024052405593100800_ref5","doi-asserted-by":"crossref","first-page":"729","DOI":"10.1038\/s41588-021-00830-1","article-title":"Functional and structural basis of extreme conservation in vertebrate 5\u2032 untranslated regions","volume":"53","author":"Byeon","year":"2021","journal-title":"Nat Genet"},{"key":"2024052405593100800_ref6","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1038\/nrm.2017.103","article-title":"Functional 5\u2032 UTR mRNA structures in eukaryotic translation regulation and how to find them","volume":"19","author":"Leppek","year":"2018","journal-title":"Nat Rev Mol Cell Biol"},{"key":"2024052405593100800_ref7","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1146\/annurev-genet-120116-024704","article-title":"Regulation by 3\u2032-untranslated regions","volume":"51","author":"Mayr","year":"2017","journal-title":"Annu Rev Genet"},{"key":"2024052405593100800_ref8","doi-asserted-by":"crossref","first-page":"7507","DOI":"10.1073\/pnas.0810916106","article-title":"Upstream open reading frames cause widespread reduction of protein expression and are polymorphic among humans","volume":"106","author":"Calvo","year":"2009","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2024052405593100800_ref9","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1126\/science.aad4939","article-title":"Comparative genetics. Systematic discovery of cap-independent translation sequences in human and viral genomes","volume":"351","author":"Weingarten-Gabbay","year":"2016","journal-title":"Science"},{"key":"2024052405593100800_ref10","doi-asserted-by":"crossref","first-page":"baab025","DOI":"10.1093\/database\/baab025","article-title":"Human IRES atlas: an integrative platform for studying IRES-driven translational regulation in humans","volume":"2021","author":"Yang","year":"2021","journal-title":"Database (Oxford)"},{"key":"2024052405593100800_ref11","doi-asserted-by":"crossref","first-page":"4943","DOI":"10.1093\/nar\/gkl620","article-title":"Sequence-specific binding of single-stranded RNA: is there a code for recognition?","volume":"34","author":"Auweter","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2024052405593100800_ref12","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1126\/science.1215691","article-title":"miRNA-mediated gene silencing by translational repression followed by mRNA deadenylation and decay","volume":"336","author":"Djuranovic","year":"2012","journal-title":"Science"},{"key":"2024052405593100800_ref13","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1038\/5082","article-title":"Mutation of the CDKN2A 5\u2032 UTR creates an aberrant initiation codon and predisposes to melanoma","volume":"21","author":"Liu","year":"1999","journal-title":"Nat Genet"},{"key":"2024052405593100800_ref14","doi-asserted-by":"crossref","first-page":"6758","DOI":"10.1073\/pnas.0701266104","article-title":"Three functional variants of IFN regulatory factor 5 (IRF5) define risk and protective haplotypes for human lupus","volume":"104","author":"Graham","year":"2007","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2024052405593100800_ref15","doi-asserted-by":"crossref","first-page":"D886","DOI":"10.1093\/nar\/gky1016","article-title":"CADD: predicting the deleteriousness of variants throughout the human genome","volume":"47","author":"Rentzsch","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2024052405593100800_ref16","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1186\/s13059-014-0480-5","article-title":"FunSeq2: a framework for prioritizing noncoding regulatory variants in cancer","volume":"15","author":"Fu","year":"2014","journal-title":"Genome Biol"},{"key":"2024052405593100800_ref17","doi-asserted-by":"crossref","first-page":"618","DOI":"10.1038\/ng.3810","article-title":"Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data","volume":"49","author":"Huang","year":"2017","journal-title":"Nat Genet"},{"key":"2024052405593100800_ref18","doi-asserted-by":"crossref","first-page":"1536","DOI":"10.1093\/bioinformatics\/btv009","article-title":"An integrative approach to predicting the functional effects of non-coding and coding sequence variation","volume":"31","author":"Shihab","year":"2015","journal-title":"Bioinformatics"},{"key":"2024052405593100800_ref19","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1038\/nmeth.2832","article-title":"Functional annotation of noncoding sequence variants","volume":"11","author":"Ritchie","year":"2014","journal-title":"Nat Methods"},{"key":"2024052405593100800_ref20","doi-asserted-by":"crossref","first-page":"667","DOI":"10.1002\/humu.24203","article-title":"Prediction of disease-associated functional variants in noncoding regions through a comprehensive analysis by integrating datasets and features","volume":"42","author":"Lu","year":"2021","journal-title":"Hum Mutat"},{"key":"2024052405593100800_ref21","doi-asserted-by":"crossref","first-page":"1171","DOI":"10.1093\/bioinformatics\/btaa783","article-title":"Annotating high-impact 5\u2032 untranslated region variants with the UTRannotator","volume":"37","author":"Zhang","year":"2021","journal-title":"Bioinformatics"},{"key":"2024052405593100800_ref22","doi-asserted-by":"crossref","first-page":"3926","DOI":"10.1093\/bioinformatics\/btab635","article-title":"Utr.annotation: a tool for annotating genomic variants that could influence post-transcriptional regulation","volume":"37","author":"Liu","year":"2021","journal-title":"Bioinformatics"},{"key":"2024052405593100800_ref23","doi-asserted-by":"crossref","first-page":"803","DOI":"10.1038\/s41587-019-0164-5","article-title":"Human 5\u2032 UTR design and variant effect prediction from a massively parallel translation assay","volume":"37","author":"Sample","year":"2019","journal-title":"Nat Biotechnol"},{"key":"2024052405593100800_ref24","doi-asserted-by":"crossref","first-page":"5247","DOI":"10.1016\/j.cell.2021.08.025","article-title":"Genome-wide functional screen of 3\u2032UTR variants uncovers causal variants for human disease and evolution","volume":"184","author":"Griesemer","year":"2021","journal-title":"Cell"},{"key":"2024052405593100800_ref25","doi-asserted-by":"crossref","first-page":"1623","DOI":"10.1002\/humu.23641","article-title":"ClinVar at five years: delivering on the promise","volume":"39","author":"Landrum","year":"2018","journal-title":"Hum Mutat"},{"key":"2024052405593100800_ref26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s00439-013-1358-4","article-title":"The human gene mutation database: building a comprehensive mutation repository for clinical and molecular genetics, diagnostic testing and personalized genomic medicine","volume":"133","author":"Stenson","year":"2014","journal-title":"Hum Genet"},{"key":"2024052405593100800_ref27","doi-asserted-by":"crossref","first-page":"D1047","DOI":"10.1093\/nar\/gkr1182","article-title":"GWASdb: a database for human genetic variants identified by genome-wide association studies","volume":"40","author":"Li","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2024052405593100800_ref28","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature11632","article-title":"An integrated map of genetic variation from 1,092 human genomes","volume":"491","author":"Genomes Project, C","year":"2012","journal-title":"Nature"},{"key":"2024052405593100800_ref29","doi-asserted-by":"crossref","first-page":"869","DOI":"10.1038\/s41591-020-0893-5","article-title":"The effect of LRRK2 loss-of-function variants in humans","volume":"26","author":"Whiffin","year":"2020","journal-title":"Nat Med"},{"key":"2024052405593100800_ref30","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1038\/nature13835","article-title":"Genetic and epigenetic fine mapping of causal autoimmune disease variants","volume":"518","author":"Farh","year":"2015","journal-title":"Nature"},{"key":"2024052405593100800_ref31","doi-asserted-by":"crossref","first-page":"28387","DOI":"10.1038\/srep28387","article-title":"Genome-wide identification of microRNA-related variants associated with risk of Alzheimer\u2019s disease","volume":"6","author":"Ghanbari","year":"2016","journal-title":"Sci Rep"},{"key":"2024052405593100800_ref32","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1186\/1471-2164-13-44","article-title":"miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3\u2032UTRs of human genes","volume":"13","author":"Bruno","year":"2012","journal-title":"BMC Genomics"},{"key":"2024052405593100800_ref33","doi-asserted-by":"crossref","first-page":"1393","DOI":"10.1038\/ng.3432","article-title":"Large-scale identification of sequence variants influencing human transcription factor occupancy in vivo","volume":"47","author":"Maurano","year":"2015","journal-title":"Nat Genet"},{"key":"2024052405593100800_ref34","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1186\/s13059-021-02305-2","article-title":"Identification of pathogenic variants in cancer genes using base editing screens with editing efficiency correction","volume":"22","author":"Huang","year":"2021","journal-title":"Genome Biol"},{"key":"2024052405593100800_ref35","doi-asserted-by":"crossref","first-page":"3555","DOI":"10.1093\/bioinformatics\/btv402","article-title":"LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants","volume":"31","author":"Machiela","year":"2015","journal-title":"Bioinformatics"},{"key":"2024052405593100800_ref36","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1038\/s41586-022-04558-8","article-title":"A joint NCBI and EMBL-EBI transcript set for clinical genomics and research","volume":"604","author":"Morales","year":"2022","journal-title":"Nature"},{"key":"2024052405593100800_ref37","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1101\/gr.097857.109","article-title":"Detection of nonneutral substitution rates on mammalian phylogenies","volume":"20","author":"Pollard","year":"2010","journal-title":"Genome Res"},{"key":"2024052405593100800_ref38","doi-asserted-by":"crossref","first-page":"3429","DOI":"10.1093\/nar\/gkg599","article-title":"Vienna RNA secondary structure server","volume":"31","author":"Hofacker","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2024052405593100800_ref39","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1186\/1748-7188-6-26","article-title":"ViennaRNA package 2.0","volume":"6","author":"Lorenz","year":"2011","journal-title":"Algorithms Mol Biol"},{"key":"2024052405593100800_ref40","doi-asserted-by":"crossref","first-page":"R24","DOI":"10.1186\/gb-2007-8-2-r24","article-title":"Quantifying similarity between motifs","volume":"8","author":"Gupta","year":"2007","journal-title":"Genome Biol"},{"key":"2024052405593100800_ref41","doi-asserted-by":"crossref","first-page":"994","DOI":"10.1038\/s41588-021-00864-5","article-title":"An atlas of alternative polyadenylation quantitative trait loci contributing to complex trait and disease heritability","volume":"53","author":"Li","year":"2021","journal-title":"Nat Genet"},{"key":"2024052405593100800_ref42","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1093\/nar\/29.1.246","article-title":"ARED: human AU-rich element-containing mRNA database reveals an unexpectedly diverse functional repertoire of encoded proteins","volume":"29","author":"Bakheet","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2024052405593100800_ref43","doi-asserted-by":"crossref","DOI":"10.1126\/science.aav1741","article-title":"The biochemical basis of microRNA targeting efficacy","volume":"366","author":"McGeary","year":"2019","journal-title":"Science"},{"key":"2024052405593100800_ref44","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1101\/gr.082701.108","article-title":"Most mammalian mRNAs are conserved targets of microRNAs","volume":"19","author":"Friedman","year":"2009","journal-title":"Genome Res"},{"key":"2024052405593100800_ref45","doi-asserted-by":"crossref","first-page":"112840","DOI":"10.1016\/j.celrep.2023.112840","article-title":"Multi-level functional genomics reveals molecular and cellular oncogenicity of patient-based 3\u2032 untranslated region mutations","volume":"42","author":"Schuster","year":"2023","journal-title":"Cell Rep"},{"key":"2024052405593100800_ref46","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab189","article-title":"WEVar: a novel statistical learning framework for predicting noncoding regulatory variants","volume":"22","author":"Wang","year":"2021","journal-title":"Brief Bioinform"},{"key":"2024052405593100800_ref47","doi-asserted-by":"crossref","first-page":"1006","DOI":"10.1093\/bioinformatics\/btt730","article-title":"CrossMap: a versatile tool for coordinate conversion between genome assemblies","volume":"30","author":"Zhao","year":"2014","journal-title":"Bioinformatics"},{"key":"2024052405593100800_ref48","article-title":"A unified approach to interpreting model predictions","volume":"30","author":"Lundberg","year":"2017","journal-title":"Adv Neural Inf Process"},{"key":"2024052405593100800_ref49","doi-asserted-by":"crossref","first-page":"854","DOI":"10.1016\/j.molcel.2018.05.001","article-title":"Sequence, structure, and context preferences of human RNA binding proteins","volume":"70","author":"Dominguez","year":"2018","journal-title":"Mol Cell"},{"key":"2024052405593100800_ref50","doi-asserted-by":"crossref","first-page":"570","DOI":"10.1093\/nar\/gky1185","article-title":"Deciphering human ribonucleoprotein regulatory networks","volume":"47","author":"Mukherjee","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2024052405593100800_ref51","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1038\/s41467-023-35876-8","article-title":"DNA damage and somatic mutations in mammalian cells after irradiation with a nail polish dryer","volume":"14","author":"Zhivagui","year":"2023","journal-title":"Nat Commun"},{"key":"2024052405593100800_ref52","doi-asserted-by":"crossref","first-page":"618","DOI":"10.1126\/science.aag0299","article-title":"Mutational signatures associated with tobacco smoking in human cancer","volume":"354","author":"Alexandrov","year":"2016","journal-title":"Science"},{"key":"2024052405593100800_ref53","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1038\/s41586-021-04043-8","article-title":"Disease variant prediction with deep generative models of evolutionary data","volume":"599","author":"Frazer","year":"2021","journal-title":"Nature"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/4\/bbae248\/57845342\/bbae248.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/4\/bbae248\/57845342\/bbae248.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,24]],"date-time":"2024-05-24T07:37:00Z","timestamp":1716536220000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae248\/7680467"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,23]]},"references-count":53,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,5,23]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae248","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,7]]},"published":{"date-parts":[[2024,5,23]]},"article-number":"bbae248"}}