{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T10:35:46Z","timestamp":1781519746457,"version":"3.54.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2017,8,23]],"date-time":"2017-08-23T00:00:00Z","timestamp":1503446400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000060","name":"National Institute of Allergy and Infectious Diseases","doi-asserted-by":"publisher","award":["HHSN272201200010C"],"award-info":[{"award-number":["HHSN272201200010C"]}],"id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Deep neural network architectures such as convolutional and long short-term memory networks have become increasingly popular as machine learning tools during the recent years. The availability of greater computational resources, more data, new algorithms for training deep models and easy to use libraries for implementation and training of neural networks are the drivers of this development. The use of deep learning has been especially successful in image recognition; and the development of tools, applications and code examples are in most cases centered within this field rather than within biology.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here, we aim to further the development of deep learning methods within biology by providing application examples and ready to apply and adapt code templates. Given such examples, we illustrate how architectures consisting of convolutional and long short-term memory neural networks can relatively easily be designed and trained to state-of-the-art performance on three biological sequence problems: prediction of subcellular localization, protein secondary structure and the binding of peptides to MHC Class II molecules.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>All implementations and datasets are available online to the scientific community at https:\/\/github.com\/vanessajurtz\/lasagne4bio.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx531","type":"journal-article","created":{"date-parts":[[2017,8,22]],"date-time":"2017-08-22T19:15:24Z","timestamp":1503429324000},"page":"3685-3690","source":"Crossref","is-referenced-by-count":140,"title":["An introduction to deep learning on biological sequence data: examples and solutions"],"prefix":"10.1093","volume":"33","author":[{"given":"Vanessa Isabell","family":"Jurtz","sequence":"first","affiliation":[{"name":"Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Alexander Rosenberg","family":"Johansen","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7885-4311","authenticated-orcid":false,"given":"Morten","family":"Nielsen","sequence":"additional","affiliation":[{"name":"Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark"},{"name":"Instituto de Investigaciones Biotecnol\u00f3gicas, Universidad Nacional de San Mart\u00edn, Buenos Aires, Argentina"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jose Juan","family":"Almagro Armenteros","sequence":"additional","affiliation":[{"name":"Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Henrik","family":"Nielsen","sequence":"additional","affiliation":[{"name":"Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Casper Kaae","family":"S\u00f8nderby","sequence":"additional","affiliation":[{"name":"Department of Biology, University of Copenhagen, Copenhagen, Denmark"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ole","family":"Winther","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark"},{"name":"Department of Biology, University of Copenhagen, Copenhagen, Denmark"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"S\u00f8ren Kaae","family":"S\u00f8nderby","sequence":"additional","affiliation":[{"name":"Department of Biology, University of Copenhagen, Copenhagen, Denmark"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2017,8,23]]},"reference":[{"key":"2023051308382838300_btx531-B1","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023051308382838300_btx531-B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023051308382838300_btx531-B3","doi-asserted-by":"crossref","first-page":"e26781.","DOI":"10.1371\/journal.pone.0026781","article-title":"NNAlign: a web-based prediction method allowing non-expert end-user discovery of sequence motifs in quantitative peptide data","volume":"6","author":"Andreatta","year":"2011","journal-title":"PLoS One"},{"key":"2023051308382838300_btx531-B4","volume-title":"Proceedings of International Conference on Learning Representations (ICLR)","author":"Bahdanau","year":"2015"},{"key":"2023051308382838300_btx531-B5","volume-title":"arXiv e-prints","author":"Bastien","year":"2016"},{"key":"2023051308382838300_btx531-B6","doi-asserted-by":"crossref","first-page":"5363","DOI":"10.1021\/pr900665y","article-title":"SherLoc2: a high-accuracy hybrid method for predicting subcellular localization of proteins","volume":"8","author":"Briesemeister","year":"2009","journal-title":"J. Proteome Res"},{"key":"2023051308382838300_btx531-B7","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1016\/S0198-8859(97)00078-5","article-title":"Antigen presentation by MHC class II molecules: invariant chain function, protein trafficking, and the molecular basis of diverse determinant capture","volume":"54","author":"Castellino","year":"1997","journal-title":"Hum. Immunol"},{"key":"2023051308382838300_btx531-B8","doi-asserted-by":"crossref","first-page":"1882","DOI":"10.1118\/1.4944498","article-title":"Urinary bladder segmentation in CT urography using deep-learning convolutional neural network and level sets","volume":"43","author":"Cha","year":"2016","journal-title":"Med. Phys"},{"key":"2023051308382838300_btx531-B9","author":"Ciresan","year":"2011"},{"key":"2023051308382838300_btx531-B10","author":"Dieleman","year":"2015"},{"key":"2023051308382838300_btx531-B11","doi-asserted-by":"crossref","first-page":"1042","DOI":"10.1126\/science.1219021","article-title":"The protein-folding problem, 50\u2009years on","volume":"338","author":"Dill","year":"2012","journal-title":"Science"},{"key":"2023051308382838300_btx531-B12","doi-asserted-by":"crossref","first-page":"1035","DOI":"10.1038\/nbt0804-1035","article-title":"Where did the BLOSUM62 alignment score matrix come from?","volume":"22","author":"Eddy","year":"2004","journal-title":"Nat. Biotechnol"},{"key":"2023051308382838300_btx531-B13","doi-asserted-by":"crossref","first-page":"953","DOI":"10.1038\/nprot.2007.131","article-title":"Locating proteins in the cell using TargetP, SignalP and related tools","volume":"2","author":"Emanuelsson","year":"2007","journal-title":"Nat. Protoc"},{"key":"2023051308382838300_btx531-B14","author":"Geiger","year":"2014"},{"key":"2023051308382838300_btx531-B15","author":"Glorot","year":"2010"},{"key":"2023051308382838300_btx531-B16","author":"Goodfellow","year":"2016"},{"key":"2023051308382838300_btx531-B17","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-24797-2","volume-title":"Supervised Sequence Labelling with Recurrent Neural Networks","author":"Graves","year":"2012"},{"key":"2023051308382838300_btx531-B18","author":"Hinton","year":"2012"},{"key":"2023051308382838300_btx531-B19","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1126\/science.1127647","article-title":"Reducing the dimensionality of data with neural networks","volume":"313","author":"Hinton","year":"2006","journal-title":"Science"},{"key":"2023051308382838300_btx531-B20","doi-asserted-by":"crossref","first-page":"1158","DOI":"10.1093\/bioinformatics\/btl002","article-title":"MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition","volume":"22","author":"H\u00f6glund","year":"2006","journal-title":"Bioinformatics"},{"key":"2023051308382838300_btx531-B21","first-page":"448","volume-title":"Proceedings of the 32nd International Conference on Machine Learning","author":"Ioffe","year":"2015"},{"key":"2023051308382838300_btx531-B22","author":"Jaderberg","year":"2015"},{"key":"2023051308382838300_btx531-B23","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023051308382838300_btx531-B24","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2023051308382838300_btx531-B25","doi-asserted-by":"crossref","first-page":"711","DOI":"10.1007\/s00251-013-0720-y","article-title":"NetMHCIIpan-3.0, a common pan-specific MHC class II prediction method including all three human MHC class II isotypes, HLA-DR, HLA-DP and HLA-DQ","volume":"65","author":"Karosiene","year":"2013","journal-title":"Immunogenetics"},{"key":"2023051308382838300_btx531-B26","volume-title":"Proceedings of International Conference on Learning Representations (ICLR)","author":"Kingma","year":"2015"},{"key":"2023051308382838300_btx531-B27","first-page":"1097","volume-title":"Advances in Neural Information Processing Systems 25","author":"Krizhevsky","year":"2012"},{"key":"2023051308382838300_btx531-B28","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2023051308382838300_btx531-B29","doi-asserted-by":"crossref","first-page":"i121","DOI":"10.1093\/bioinformatics\/btu277","article-title":"Deep learning of the tissue-regulated splicing code","volume":"30","author":"Leung","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051308382838300_btx531-B30","volume-title":"Molecular Cell Biology","author":"Lodish","year":"2016","edition":"8th ed."},{"key":"2023051308382838300_btx531-B32","first-page":"1252","author":"Moeskops","year":"2016"},{"key":"2023051308382838300_btx531-B33","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1111\/j.1365-2567.2010.03268.x","article-title":"MHC class II epitope predictive algorithms","volume":"130","author":"Nielsen","year":"2010","journal-title":"Immunology"},{"key":"2023051308382838300_btx531-B34","doi-asserted-by":"crossref","first-page":"296.","DOI":"10.1186\/1471-2105-10-296","article-title":"NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction","volume":"10","author":"Nielsen","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023051308382838300_btx531-B35","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/nri3818","article-title":"The ins and outs of MHC class II-mediated antigen processing and presentation","volume":"15","author":"Roche","year":"2015","journal-title":"Nat. Rev. Immunol"},{"key":"2023051308382838300_btx531-B36","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.neunet.2014.09.003","article-title":"Deep learning in neural networks: An overview","volume":"61","author":"Schmidhuber","year":"2015","journal-title":"Neural Netw"},{"key":"2023051308382838300_btx531-B37","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1007\/978-3-319-21233-3_6","volume-title":"Algorithms for Computational Biology","author":"S\u00f8nderby","year":"2015"},{"key":"2023051308382838300_btx531-B38","author":"S\u00f8nderby","year":"2014"},{"key":"2023051308382838300_btx531-B39","first-page":"3104","volume-title":"Advances in Neural Information Processing Systems","author":"Sutskever","year":"2014"},{"key":"2023051308382838300_btx531-B40","doi-asserted-by":"crossref","first-page":"18962.","DOI":"10.1038\/srep18962","article-title":"Protein secondary structure prediction using deep convolutional neural fields","volume":"6","author":"Wang","year":"2016","journal-title":"Sci. Rep"},{"key":"2023051308382838300_btx531-B31","author":"William,L.H. (2009) Machine Learning-Encyclopedia Britannica"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/22\/3685\/50307397\/bioinformatics_33_22_3685.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/22\/3685\/50307397\/bioinformatics_33_22_3685.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,13]],"date-time":"2023-05-13T08:38:57Z","timestamp":1683967137000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/22\/3685\/4092933"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2017,8,23]]},"references-count":40,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2017,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx531","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,11,15]]},"published":{"date-parts":[[2017,8,23]]}}}