{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:46Z","timestamp":1772138086945,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2018,6,27]],"date-time":"2018-06-27T00:00:00Z","timestamp":1530057600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Alternative splice site selection is inherently competitive and the probability of a given splice site to be used also depends on the strength of neighboring sites. Here, we present a new model named the competitive splice site model (COSSMO), which explicitly accounts for these competitive effects and predicts the percent selected index (PSI) distribution over any number of putative splice sites. We model an alternative splicing event as the choice of a 3\u2032 acceptor site conditional on a fixed upstream 5\u2032 donor site or the choice of a 5\u2032 donor site conditional on a fixed 3\u2032 acceptor site. We build four different architectures that use convolutional layers, communication layers, long short-term memory and residual networks, respectively, to learn relevant motifs from sequence alone. We also construct a new dataset from genome annotations and RNA-Seq read data that we use to train our model.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>COSSMO is able to predict the most frequently used splice site with an accuracy of 70% on unseen test data, and achieve an R2 of 0.6 in modeling the PSI distribution. We visualize the motifs that COSSMO learns from sequence and show that COSSMO recognizes the consensus splice site sequences and many known splicing factors with high specificity.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Model predictions, our training dataset, and code are available from http:\/\/cossmo.genes.toronto.edu.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty244","type":"journal-article","created":{"date-parts":[[2018,4,16]],"date-time":"2018-04-16T15:11:48Z","timestamp":1523891508000},"page":"i429-i437","source":"Crossref","is-referenced-by-count":54,"title":["COSSMO: predicting competitive alternative splice site selection using deep learning"],"prefix":"10.1093","volume":"34","author":[{"given":"Hannes","family":"Bretschneider","sequence":"first","affiliation":[{"name":"Deep Genomics Inc, Toronto, Canada"},{"name":"Department of Computer Science, University of Toronto, Toronto, Canada"}]},{"given":"Shreshth","family":"Gandhi","sequence":"additional","affiliation":[{"name":"Deep Genomics Inc, Toronto, Canada"}]},{"given":"Amit G","family":"Deshwar","sequence":"additional","affiliation":[{"name":"Deep Genomics Inc, Toronto, Canada"},{"name":"Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada"}]},{"given":"Khalid","family":"Zuberi","sequence":"additional","affiliation":[{"name":"Deep Genomics Inc, Toronto, Canada"}]},{"given":"Brendan J","family":"Frey","sequence":"additional","affiliation":[{"name":"Deep Genomics Inc, Toronto, Canada"},{"name":"Department of Computer Science, University of Toronto, Toronto, Canada"},{"name":"Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada"}]}],"member":"286","published-online":{"date-parts":[[2018,6,27]]},"reference":[{"key":"2023051604324922300_bty244-B1","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023051604324922300_bty244-B2","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1038\/nature09000","article-title":"Deciphering the splicing code","volume":"465","author":"Barash","year":"2010","journal-title":"Nature"},{"key":"2023051604324922300_bty244-B3","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1261\/rna.048769.114","article-title":"Splicing predictions reliably classify different types of alternative splicing","volume":"21","author":"Busch","year":"2015","journal-title":"RNA"},{"key":"2023051604324922300_bty244-B4","doi-asserted-by":"crossref","first-page":"3078","DOI":"10.1038\/ncomms4078","article-title":"The splicing activator dazap1 integrates splicing control into mek\/erk-regulated cell proliferation and migration","volume":"5","author":"Choudhury","year":"2014","journal-title":"Nat. Commun"},{"key":"2023051604324922300_bty244-B5","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1038\/ng.2653","article-title":"The Genotype-Tissue Expression (GTEx) project","volume":"45","author":"Lonsdale","year":"2013","journal-title":"Nat. Genet"},{"key":"2023051604324922300_bty244-B6","doi-asserted-by":"crossref","first-page":"R24","DOI":"10.1186\/gb-2007-8-2-r24","article-title":"Quantifying similarity between motifs","volume":"8","author":"Gupta","year":"2007","journal-title":"Genome Biology"},{"key":"2023051604324922300_bty244-B7","doi-asserted-by":"crossref","first-page":"1760","DOI":"10.1101\/gr.135350.111","article-title":"Gencode: the reference human genome annotation for the encode project","volume":"22","author":"Harrow","year":"2012","journal-title":"Genome Res"},{"key":"2023051604324922300_bty244-B8","author":"He","year":"2015"},{"key":"2023051604324922300_bty244-B9","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput"},{"key":"2023051604324922300_bty244-B10","doi-asserted-by":"crossref","first-page":"2392","DOI":"10.1093\/molbev\/msl111","article-title":"Intron size, abundance, and distribution within untranslated regions of genes","volume":"23","author":"Hong","year":"2006","journal-title":"Mol. Biol. Evol"},{"key":"2023051604324922300_bty244-B11","author":"Ioffe","year":"2015"},{"key":"2023051604324922300_bty244-B12","author":"Kelley","year":"2018"},{"key":"2023051604324922300_bty244-B13","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.3317","article-title":"HISAT: a fast spliced aligner with low memory requirements","volume":"12","author":"Kim","year":"2015","journal-title":"Nat. Methods"},{"key":"2023051604324922300_bty244-B14","first-page":"i121","article-title":"Deep learning of the tissue-regulated splicing code","volume":"30","author":"Leung","year":"2014","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023051604324922300_bty244-B15","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/978-0-387-77374-2_8","article-title":"hnrnp proteins and splicing control","volume":"623","author":"Martinez-Contreras","year":"2007","journal-title":"Adv. Exp. Med. Biol"},{"key":"2023051604324922300_bty244-B16","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1101\/gad.7.3.393","article-title":"Cloning and characterization of psf, a novel pre-mrna splicing factor","volume":"7","author":"Patton","year":"1993","journal-title":"Genes Dev"},{"key":"2023051604324922300_bty244-B17","doi-asserted-by":"crossref","first-page":"e107.","DOI":"10.1093\/nar\/gkw226","article-title":"DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences","volume":"44","author":"Quang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023051604324922300_bty244-B18","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1038\/nature12311","article-title":"A compendium of RNA-binding motifs for decoding gene regulation","volume":"499","author":"Ray","year":"2013","journal-title":"Nature"},{"key":"2023051604324922300_bty244-B19","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1038\/nrg.2015.3","article-title":"RNA mis-splicing in disease","volume":"17","author":"Scotti","year":"2016","journal-title":"Nat. Rev. Genet"},{"key":"2023051604324922300_bty244-B20","first-page":"2244","volume-title":"Advances in Neural Information Processing Systems","author":"Sukhbaatar","year":"2016"},{"key":"2023051604324922300_bty244-B21","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1002\/wrna.1141","article-title":"The significant other: splicing by the minor spliceosome","volume":"4","author":"Turunen","year":"2013","journal-title":"Wiley Interdiscip Rev RNA"},{"key":"2023051604324922300_bty244-B22","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1007\/s00439-017-1809-4","article-title":"Deep intronic mutations and human disease","volume":"136","author":"Vaz-Drago","year":"2017","journal-title":"Hum. Genet"},{"key":"2023051604324922300_bty244-B23","doi-asserted-by":"crossref","first-page":"802","DOI":"10.1261\/rna.876308","article-title":"Splicing regulation: from a parts list of regulatory elements to an integrated splicing code","volume":"14","author":"Wang","year":"2008","journal-title":"RNA"},{"key":"2023051604324922300_bty244-B24","first-page":"2554","article-title":"Bayesian prediction of tissue-regulated splicing using RNA sequence and cellular context","volume":"27","author":"Xiong","year":"2011","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023051604324922300_bty244-B25","doi-asserted-by":"crossref","first-page":"1254806","DOI":"10.1126\/science.1254806","article-title":"The human splicing code reveals new insights into the genetic determinants of disease","volume":"347","author":"Xiong","year":"2015","journal-title":"Science"},{"key":"2023051604324922300_bty244-B26","author":"Xiong","year":"2016"},{"key":"2023051604324922300_bty244-B27","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1089\/1066527041410418","article-title":"Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals","volume":"11","author":"Yeo","year":"2004","journal-title":"J. Comput. Biol"},{"key":"2023051604324922300_bty244-B28","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1038\/355609a0","article-title":"Cloning and domain structure of the mammalian splicing factor u2af","volume":"355","author":"Zamore","year":"1992","journal-title":"Nature"},{"key":"2023051604324922300_bty244-B29","doi-asserted-by":"crossref","first-page":"919","DOI":"10.1093\/hmg\/7.5.919","article-title":"Statistical features of human exons and their flanking regions","volume":"7","author":"Zhang","year":"1998","journal-title":"Hum. Mol. Genet"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i429\/50316264\/bioinformatics_34_13_i429.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i429\/50316264\/bioinformatics_34_13_i429.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T00:33:18Z","timestamp":1684197198000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/13\/i429\/5045709"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,6,27]]},"references-count":29,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2018,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty244","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/255257","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,7,1]]},"published":{"date-parts":[[2018,6,27]]}}}