{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T20:05:41Z","timestamp":1775678741643,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1008982","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,5,20]],"date-time":"2021-05-20T00:00:00Z","timestamp":1621468800000}}],"reference-count":55,"publisher":"Public Library of Science (PLoS)","issue":"5","license":[{"start":{"date-parts":[[2021,5,10]],"date-time":"2021-05-10T00:00:00Z","timestamp":1620604800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["01IS18053F"],"award-info":[{"award-number":["01IS18053F"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>The 5\u2019 untranslated region plays a key role in regulating mRNA translation and consequently protein abundance. Therefore, accurate modeling of 5\u2019UTR regulatory sequences shall provide insights into translational control mechanisms and help interpret genetic variants. Recently, a model was trained on a massively parallel reporter assay to predict mean ribosome load (MRL)\u2014a proxy for translation rate\u2014directly from 5\u2019UTR sequence with a high degree of accuracy. However, this model is restricted to sequence lengths investigated in the reporter assay and therefore cannot be applied to the majority of human sequences without a substantial loss of information. Here, we introduced frame pooling, a novel neural network operation that enabled the development of an MRL prediction model for 5\u2019UTRs of any length. Our model shows state-of-the-art performance on fixed length randomized sequences, while offering better generalization performance on longer sequences and on a variety of translation-related genome-wide datasets. Variant interpretation is demonstrated on a 5\u2019UTR variant of the gene HBB associated with beta-thalassemia. Frame pooling could find applications in other bioinformatics predictive tasks. Moreover, our model, released open source, could help pinpoint pathogenic genetic variants.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1008982","type":"journal-article","created":{"date-parts":[[2021,5,10]],"date-time":"2021-05-10T14:57:30Z","timestamp":1620658650000},"page":"e1008982","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":51,"title":["Predicting mean ribosome load for 5\u2019UTR of any length using deep learning"],"prefix":"10.1371","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7570-7877","authenticated-orcid":true,"given":"Alexander","family":"Karollus","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7790-8936","authenticated-orcid":true,"given":"\u017diga","family":"Avsec","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8924-8365","authenticated-orcid":true,"given":"Julien","family":"Gagneur","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2021,5,10]]},"reference":[{"key":"pcbi.1008982.ref001","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1016\/j.cell.2016.03.014","article-title":"On the Dependency of Cellular Protein Levels on mRNA Abundance.","volume":"165","author":"Y Liu","year":"2016","journal-title":"Cell"},{"key":"pcbi.1008982.ref002","doi-asserted-by":"crossref","first-page":"e1005535","DOI":"10.1371\/journal.pcbi.1005535","article-title":"Post-transcriptional regulation across human tissues.","volume":"13","author":"A Franks","year":"2017","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1008982.ref003","doi-asserted-by":"crossref","DOI":"10.15252\/msb.20188513","article-title":"Quantification and discovery of sequence determinants of protein-per-mRNA amount in 29 human tissues","volume":"15","author":"B Eraslan","year":"2019","journal-title":"Mol Syst Biol"},{"key":"pcbi.1008982.ref004","doi-asserted-by":"crossref","first-page":"E19","DOI":"10.1038\/nature22293","article-title":"Can we predict protein from mRNA levels?","volume":"547","author":"N Fortelny","year":"2017","journal-title":"Nature"},{"key":"pcbi.1008982.ref005","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1038\/nature10098","article-title":"Global quantification of mammalian gene expression control","volume":"473","author":"B Schwanh\u00e4usser","year":"2011","journal-title":"Nature"},{"key":"pcbi.1008982.ref006","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1038\/nrm2838","article-title":"The mechanism of eukaryotic translation initiation and principles of its regulation","volume":"11","author":"RJ Jackson","year":"2010","journal-title":"Nat Rev Mol Cell Biol"},{"key":"pcbi.1008982.ref007","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1016\/0092-8674(78)90039-9","article-title":"How do eucaryotic ribosomes select initiation regions in messenger RNA?","volume":"15","author":"M. Kozak","year":"1978","journal-title":"Cell"},{"key":"pcbi.1008982.ref008","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1146\/annurev-biochem-060713-035802","article-title":"The scanning mechanism of eukaryotic translation initiation","volume":"83","author":"AG Hinnebusch","year":"2014","journal-title":"Annu Rev Biochem"},{"key":"pcbi.1008982.ref009","doi-asserted-by":"crossref","first-page":"748","DOI":"10.15252\/msb.20145136","article-title":"Quantitative analysis of mammalian translation initiation sites by FACS -seq","author":"WL Noderer","year":"2014","journal-title":"Molecular Systems Biology"},{"key":"pcbi.1008982.ref010","doi-asserted-by":"crossref","first-page":"8125","DOI":"10.1093\/nar\/15.20.8125","article-title":"An analysis of 5\u2019-noncoding sequences from 699 vertebrate messenger RNAs","volume":"15","author":"M. Kozak","year":"1987","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1008982.ref011","doi-asserted-by":"crossref","first-page":"2850","DOI":"10.1073\/pnas.83.9.2850","article-title":"Influences of mRNA secondary structure on initiation by eukaryotic ribosomes","volume":"83","author":"M. Kozak","year":"1986","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1008982.ref012","doi-asserted-by":"crossref","first-page":"8301","DOI":"10.1073\/pnas.87.21.8301","article-title":"Downstream secondary structure facilitates recognition of initiator codons by eukaryotic ribosomes","volume":"87","author":"M. Kozak","year":"1990","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1008982.ref013","doi-asserted-by":"crossref","first-page":"7507","DOI":"10.1073\/pnas.0810916106","article-title":"Upstream open reading frames cause widespread reduction of protein expression and are polymorphic among humans","volume":"106","author":"SE Calvo","year":"2009","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1008982.ref014","doi-asserted-by":"crossref","first-page":"1610","DOI":"10.1101\/gr.193342.115","article-title":"Integrative analysis of RNA, translation, and protein levels reveals distinct regulatory variation across humans","volume":"25","author":"C Cenik","year":"2015","journal-title":"Genome Res"},{"key":"pcbi.1008982.ref015","article-title":"Extensive allele-specific translational regulation in hybrid mice","volume":"11","author":"J Hou","year":"2015","journal-title":"Mol Syst Biol"},{"key":"pcbi.1008982.ref016","article-title":"Characterising the loss-of-function impact of 5\u2019untranslated region variants in whole genome sequence data from 15,708 individuals.","author":"N Whiffin","year":"2019","journal-title":"BioRxiv"},{"key":"pcbi.1008982.ref017","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1038\/5082","article-title":"Mutation of the CDKN2A 5\u2019 UTR creates an aberrant initiation codon and predisposes to melanoma","volume":"21","author":"L Liu","year":"1999","journal-title":"Nat Genet"},{"key":"pcbi.1008982.ref018","doi-asserted-by":"crossref","first-page":"e1474","DOI":"10.1002\/wrna.1474","article-title":"Genetic variants in mRNA untranslated regions","volume":"9","author":"M Steri","year":"2018","journal-title":"Wiley Interdiscip Rev RNA"},{"key":"pcbi.1008982.ref019","first-page":"226","article-title":"Neural network prediction of translation initiation sites in eukaryotes: perspectives for EST and genome analysis","volume":"5","author":"AG Pedersen","year":"1997","journal-title":"Proc Int Conf Intell Syst Mol Biol"},{"key":"pcbi.1008982.ref020","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1093\/bioinformatics\/16.9.799","article-title":"Engineering support vector machine kernels that recognize translation initiation sites","volume":"16","author":"A Zien","year":"2000","journal-title":"Bioinformatics"},{"key":"pcbi.1008982.ref021","doi-asserted-by":"crossref","first-page":"e1005170","DOI":"10.1371\/journal.pcbi.1005170","article-title":"PreTIS: A Tool to Predict Non-canonical 5\u2019 UTR Translational Initiation Sites in Human and Mouse.","author":"K Reuter","year":"2016","journal-title":"PLOS Computational Biology"},{"key":"pcbi.1008982.ref022","doi-asserted-by":"crossref","first-page":"11663","DOI":"10.1038\/ncomms11663","article-title":"Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish.","volume":"7","author":"G-L Chew","year":"2016","journal-title":"Nat Commun."},{"key":"pcbi.1008982.ref023","doi-asserted-by":"crossref","first-page":"i234","DOI":"10.1093\/bioinformatics\/btx247","article-title":"TITER: predicting translation initiation sites by deep learning","volume":"33","author":"S Zhang","year":"2017","journal-title":"Bioinformatics"},{"key":"pcbi.1008982.ref024","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1089\/cmb.2005.12.702","article-title":"A class of edit kernels for SVMs to predict translation initiation sites in eukaryotic mRNAs","volume":"12","author":"H Li","year":"2005","journal-title":"J Comput Biol"},{"key":"pcbi.1008982.ref025","doi-asserted-by":"crossref","first-page":"803","DOI":"10.1038\/s41587-019-0164-5","article-title":"Human 5\u2032 UTR design and variant effect prediction from a massively parallel translation assay","volume":"37","author":"PJ Sample","year":"2019","journal-title":"Nat Biotechnol"},{"key":"pcbi.1008982.ref026","doi-asserted-by":"crossref","DOI":"10.1186\/gb-2002-3-3-reviews0004","article-title":"Untranslated regions of mRNAs","volume":"3","author":"F Mignone","year":"2002","journal-title":"Genome Biol"},{"key":"pcbi.1008982.ref027","article-title":"Network In Network.","author":"M Lin","year":"2013","journal-title":"arXiv [cs.NE]."},{"key":"pcbi.1008982.ref028","article-title":"Striving for Simplicity: The All Convolutional Net.","author":"JT Springenberg","year":"2014","journal-title":"arXiv [cs.LG]."},{"key":"pcbi.1008982.ref029","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1038\/s41576-019-0122-6","article-title":"Deep learning: new computational modelling techniques for genomics","volume":"20","author":"G Eraslan","year":"2019","journal-title":"Nat Rev Genet"},{"key":"pcbi.1008982.ref030","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1038\/s41587-019-0140-0","article-title":"The Kipoi repository accelerates community exchange and reuse of predictive models for genomics","volume":"37","author":"\u017d Avsec","year":"2019","journal-title":"Nat Biotechnol"},{"key":"pcbi.1008982.ref031","first-page":"985","article-title":"Complete motif analysis of sequence requirements for translation initiation at non-AUG start codons","author":"AJ Diaz de Arce","year":"2018","journal-title":"Nucleic Acids Research"},{"key":"pcbi.1008982.ref032","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1126\/science.1168978","article-title":"Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling","volume":"324","author":"NT Ingolia","year":"2009","journal-title":"Science"},{"key":"pcbi.1008982.ref033","doi-asserted-by":"crossref","DOI":"10.1038\/msb.2010.59","article-title":"Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line","volume":"6","author":"C Vogel","year":"2010","journal-title":"Mol Syst Biol"},{"key":"pcbi.1008982.ref034","doi-asserted-by":"crossref","first-page":"e10921","DOI":"10.7554\/eLife.10921","article-title":"Tunable protein synthesis by transcript isoforms in human cells.","volume":"5","author":"SN Floor","year":"2016","journal-title":"Elife"},{"key":"pcbi.1008982.ref035","doi-asserted-by":"crossref","first-page":"e03971","DOI":"10.7554\/eLife.03971","article-title":"Translation of 5\u2032 leaders is pervasive in genes resistant to eIF2 repression.","volume":"4","author":"DE Andreev","year":"2015","journal-title":"Elife"},{"key":"pcbi.1008982.ref036","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1016\/j.molcel.2014.08.028","article-title":"mRNA Destabilization Is the Dominant Effect of Mammalian MicroRNAs by the Time Substantial Repression Ensues","volume":"56","author":"SW Eichhorn","year":"2014","journal-title":"Mol Cell"},{"key":"pcbi.1008982.ref037","doi-asserted-by":"crossref","first-page":"11194","DOI":"10.1038\/ncomms11194","article-title":"Genome-wide assessment of differential translations with ribosome profiling data.","volume":"7","author":"Z Xiao","year":"2016","journal-title":"Nat Commun"},{"key":"pcbi.1008982.ref038","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1038\/nature13319","article-title":"Mass-spectrometry-based draft of the human proteome","volume":"509","author":"M Wilhelm","year":"2014","journal-title":"Nature"},{"key":"pcbi.1008982.ref039","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1101\/gr.097857.109","article-title":"Detection of nonneutral substitution rates on mammalian phylogenies","volume":"20","author":"KS Pollard","year":"2010","journal-title":"Genome Res"},{"key":"pcbi.1008982.ref040","article-title":"Tf-Modisco v0. 4.4. 2-Alpha.","author":"A Shrikumar","year":"2018","journal-title":"arXiv preprint arXiv:1811 00416"},{"key":"pcbi.1008982.ref041","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1046\/j.1365-2141.2003.04754.x","article-title":"\u03b2+ 45 G\u2192 C: a novel silent \u03b2-thalassaemia mutation, the first in the Kozak sequence","volume":"124","author":"M De Angioletti","year":"2004","journal-title":"Br J Haematol"},{"key":"pcbi.1008982.ref042","doi-asserted-by":"crossref","first-page":"67","DOI":"10.3109\/03630269109072485","article-title":"The G\u2014-A mutation at position+ 22 3\u2019to the Cap site of the beta-globin gene as a possible cause for a beta-thalassemia","volume":"15","author":"R Oner","year":"1991","journal-title":"Hemoglobin"},{"key":"pcbi.1008982.ref043","article-title":"Reverse-complement parameter sharing improves deep learning models for genomics","author":"A Shrikumar","year":"2017","journal-title":"bioRxiv"},{"key":"pcbi.1008982.ref044","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory.","volume":"9","author":"S Hochreiter","year":"1997","journal-title":"Neural Comput"},{"key":"pcbi.1008982.ref045","doi-asserted-by":"crossref","DOI":"10.1126\/science.aad4939","article-title":"Comparative genetics. Systematic discovery of cap-independent translation sequences in human and viral genomes","volume":"351","author":"S Weingarten-Gabbay","year":"2016","journal-title":"Science"},{"key":"pcbi.1008982.ref046","doi-asserted-by":"crossref","first-page":"927","DOI":"10.1080\/15476286.2016.1212802","article-title":"Toward a systematic understanding of translational regulatory elements in human and viruses","volume":"13","author":"S Weingarten-Gabbay","year":"2016","journal-title":"RNA Biol"},{"key":"pcbi.1008982.ref047","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. pp. 770\u2013778.","DOI":"10.1109\/CVPR.2016.90"},{"key":"pcbi.1008982.ref048","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","journal-title":"Scikit-learn: Machine learning in Python. the Journal of machine Learning research"},{"key":"pcbi.1008982.ref049","unstructured":"Bergstra J, Yamins D, Cox D. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In: Dasgupta S, McAllester D, editors. Proceedings of the 30th International Conference on Machine Learning. Atlanta, Georgia, USA: PMLR; 2013. pp. 115\u2013123."},{"key":"pcbi.1008982.ref050","article-title":"Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes.","author":"KJ Karczewski","year":"2019","journal-title":"BioRxiv"},{"key":"pcbi.1008982.ref051","article-title":"Towards better understanding of gradient-based attribution methods for Deep Neural Networks.","author":"M Ancona","year":"2017","journal-title":"arXiv [cs.LG]."},{"key":"pcbi.1008982.ref052","doi-asserted-by":"crossref","first-page":"1261","DOI":"10.1093\/bioinformatics\/btx727","article-title":"Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks","volume":"34","author":"\u017d Avsec","year":"2018","journal-title":"Bioinformatics"},{"key":"pcbi.1008982.ref053","author":"A Shrikumar","journal-title":"Gkmexplain: Fast and Accurate Interpretation of Nonlinear Gapped k-mer SVMs Using Integrated Gradients"},{"key":"pcbi.1008982.ref054","article-title":"Base-resolution models of transcription factor binding reveal soft motif syntax.","author":"\u017d Avsec","year":"2020","journal-title":"bioRxiv."},{"key":"pcbi.1008982.ref055","unstructured":"Shrikumar A, Greenside P, Kundaje A. Learning Important Features Through Propagating Activation Differences. Proceedings of the 34th International Conference on Machine Learning\u2014Volume 70. Sydney, NSW, Australia: JMLR.org; 2017. pp. 3145\u20133153."}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1008982","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,5,20]],"date-time":"2021-05-20T00:00:00Z","timestamp":1621468800000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1008982","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,20]],"date-time":"2021-05-20T14:11:08Z","timestamp":1621519868000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1008982"}},"subtitle":[],"editor":[{"given":"Predrag","family":"Radivojac","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,5,10]]},"references-count":55,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2021,5,10]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1008982","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.06.15.152728","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5,10]]}}}