{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T15:14:33Z","timestamp":1776352473445,"version":"3.51.2"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"15","license":[{"start":{"date-parts":[[2020,5,16]],"date-time":"2020-05-16T00:00:00Z","timestamp":1589587200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"USDA Hatch"},{"DOI":"10.13039\/100007263","name":"Virginia Tech","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100007263","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Transposable elements (TEs) classification is an essential step to decode their roles in genome evolution. With a large number of genomes from non-model species becoming available, accurate and efficient TE classification has emerged as a new challenge in genomic sequence analysis.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We developed a novel tool, DeepTE, which classifies unknown TEs using convolutional neural networks (CNNs). DeepTE transferred sequences into input vectors based on k-mer counts. A tree structured classification process was used where eight models were trained to classify TEs into super families and orders. DeepTE also detected domains inside TEs to correct false classification. An additional model was trained to distinguish between non-TEs and TEs in plants. Given unclassified TEs of different species, DeepTE can classify TEs into seven orders, which include 15, 24 and 16 super families in plants, metazoans and fungi, respectively. In several benchmarking tests, DeepTE outperformed other existing tools for TE classification. In conclusion, DeepTE successfully leverages CNN for TE classification, and can be used to precisely classify TEs in newly sequenced eukaryotic genomes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>DeepTE is accessible at https:\/\/github.com\/LiLabAtVT\/DeepTE.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa519","type":"journal-article","created":{"date-parts":[[2020,5,12]],"date-time":"2020-05-12T15:10:16Z","timestamp":1589296216000},"page":"4269-4275","source":"Crossref","is-referenced-by-count":198,"title":["DeepTE: a computational method for\n                    <i>de novo<\/i>\n                    classification of transposons with convolutional neural network"],"prefix":"10.1093","volume":"36","author":[{"given":"Haidong","family":"Yan","sequence":"first","affiliation":[{"name":"School of Plant and Environmental Sciences (SPES), Virginia Tech , Blacksburg, VA 24061, USA"}]},{"given":"Aureliano","family":"Bombarely","sequence":"additional","affiliation":[{"name":"School of Plant and Environmental Sciences (SPES), Virginia Tech , Blacksburg, VA 24061, USA"},{"name":"Department of Life Sciences, University of Milan , Milan 20122, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8133-3944","authenticated-orcid":false,"given":"Song","family":"Li","sequence":"additional","affiliation":[{"name":"School of Plant and Environmental Sciences (SPES), Virginia Tech , Blacksburg, VA 24061, USA"},{"name":"Graduate Program in Genetics, Bioinformatics and Computational Biology (GBCB), Virginia Tech , Blacksburg, VA 24061, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,5,16]]},"reference":[{"key":"2023062312042251700_btaa519-B1","first-page":"1","volume-title":"Encyclopedia of Research Design","author":"Abdi","year":"2010"},{"key":"2023062312042251700_btaa519-B2","doi-asserted-by":"crossref","first-page":"1329","DOI":"10.1093\/bioinformatics\/btp084","article-title":"TEclass\u2014a tool for automated classification of unknown eukaryotic transposable elements","volume":"25","author":"Abrus\u00e1n","year":"2009","journal-title":"Bioinformatics"},{"key":"2023062312042251700_btaa519-B3","author":"Agarap","year":"2018"},{"key":"2023062312042251700_btaa519-B4","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA-and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023062312042251700_btaa519-B5","doi-asserted-by":"crossref","first-page":"i237","DOI":"10.1093\/bioinformatics\/bty228","article-title":"Convolutional neural networks for classification of alignments of non-coding RNA sequences","volume":"34","author":"Aoki","year":"2018","journal-title":"Bioinformatics"},{"key":"2023062312042251700_btaa519-B6","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/s13100-015-0041-9","article-title":"Repbase update, a database of repetitive elements in eukaryotic genomes","volume":"6","author":"Bao","year":"2015","journal-title":"Mob. DNA"},{"key":"2023062312042251700_btaa519-B7","first-page":"806","author":"Barandela","year":"2004"},{"key":"2023062312042251700_btaa519-B8","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1186\/s13059-018-1577-z","article-title":"Ten things you should know about transposable elements","volume":"19","author":"Bourque","year":"2018","journal-title":"Genome Biol"},{"key":"2023062312042251700_btaa519-B9","author":"Chollet","year":"2015"},{"key":"2023062312042251700_btaa519-B10","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1186\/s12859-018-2376-y","article-title":"MITE Tracker: an accurate approach to identify miniature inverted-repeat transposable elements in large genomes","volume":"19","author":"Crescente","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023062312042251700_btaa519-B11","doi-asserted-by":"crossref","first-page":"S11","DOI":"10.1186\/1471-2105-11-S11-S11","article-title":"MiRenSVM: towards better prediction of microRNA precursors using an ensemble SVM classifier with multi-loop features","volume":"11","author":"Ding","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023062312042251700_btaa519-B12","author":"Eddy","year":"2010"},{"key":"2023062312042251700_btaa519-B13","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1186\/1471-2105-9-18","article-title":"LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons","volume":"9","author":"Ellinghaus","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023062312042251700_btaa519-B14","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1038\/s41576-019-0122-6","article-title":"Deep learning: new computational modelling techniques for genomics","volume":"20","author":"Eraslan","year":"2019","journal-title":"Nat. Rev. Genet"},{"key":"2023062312042251700_btaa519-B1115"},{"key":"2023062312042251700_btaa519-B15","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1186\/s12859-018-2182-6","article-title":"Deep learning models for bacteria taxonomic classification of metagenomic data","volume":"19","author":"Fiannaca","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023062312042251700_btaa519-B16","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.artmed.2015.06.002","article-title":"A k-mer-based barcode DNA classification methodology based on spectral representation and a neural gas network","volume":"64","author":"Fiannaca","year":"2015","journal-title":"Artif. Intell. Med"},{"key":"2023062312042251700_btaa519-B17","doi-asserted-by":"crossref","first-page":"e16526","DOI":"10.1371\/journal.pone.0016526","article-title":"Considering transposable element diversification in de novo annotation approaches","volume":"6","author":"Flutre","year":"2011","journal-title":"PLoS One"},{"key":"2023062312042251700_btaa519-B18","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1016\/j.ygeno.2012.07.004","article-title":"Characterization and functional annotation of nested transposable elements in eukaryotic genomes","volume":"100","author":"Gao","year":"2012","journal-title":"Genomics"},{"key":"2023062312042251700_btaa519-B19","doi-asserted-by":"crossref","first-page":"e1003711","DOI":"10.1371\/journal.pcbi.1003711","article-title":"Enhanced regulatory sequence prediction using gapped k-mer features","volume":"10","author":"Ghandi","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023062312042251700_btaa519-B20","doi-asserted-by":"crossref","first-page":"688","DOI":"10.1038\/s41576-018-0050-x","article-title":"Computational tools to unmask transposable elements","volume":"19","author":"Goerner-Potvin","year":"2018","journal-title":"Nat. Rev. Genet"},{"key":"2023062312042251700_btaa519-B21","doi-asserted-by":"crossref","first-page":"e199","DOI":"10.1093\/nar\/gkq862","article-title":"MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences","volume":"38","author":"Han","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023062312042251700_btaa519-B22","doi-asserted-by":"crossref","first-page":"e91929","DOI":"10.1371\/journal.pone.0091929","article-title":"PASTEC: an automatic transposable element classification tool","volume":"9","author":"Hoede","year":"2014","journal-title":"PLoS One"},{"key":"2023062312042251700_btaa519-B23","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1186\/s12920-018-0418-y","article-title":"MiteFinderII: a novel tool to identify miniature inverted-repeat transposable elements hidden in eukaryotic genomes","volume":"11","author":"Hu","year":"2018","journal-title":"BMC Med. Genomics"},{"key":"2023062312042251700_btaa519-B24","doi-asserted-by":"crossref","first-page":"e99982","DOI":"10.1371\/journal.pone.0099982","article-title":"Effective automated feature construction and selection for classification of biological sequences","volume":"9","author":"Kamath","year":"2014","journal-title":"PLoS One"},{"key":"2023062312042251700_btaa519-B25","doi-asserted-by":"crossref","first-page":"990","DOI":"10.1101\/gr.200535.115","article-title":"Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks","volume":"26","author":"Kelley","year":"2016","journal-title":"Genome Res"},{"key":"2023062312042251700_btaa519-B26","author":"Kingma","year":"2014"},{"key":"2023062312042251700_btaa519-B27","first-page":"1106","author":"Krizhevsky","year":"2012"},{"key":"2023062312042251700_btaa519-B28","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1104\/pp.107.110353","article-title":"TEnest: automated chronological annotation and visualization of nested plant transposable elements","volume":"146","author":"Kronmiller","year":"2008","journal-title":"Plant Physiol"},{"key":"2023062312042251700_btaa519-B29","doi-asserted-by":"crossref","first-page":"2167","DOI":"10.1101\/gr.121905.111","article-title":"Discriminative prediction of mammalian enhancers from DNA sequence","volume":"21","author":"Lee","year":"2011","journal-title":"Genome Res"},{"key":"2023062312042251700_btaa519-B30","volume-title":"The NCBI Handbook, National Library of Medicine","author":"Madden","year":"2013"},{"key":"2023062312042251700_btaa519-B31","doi-asserted-by":"crossref","first-page":"587","DOI":"10.18388\/abp.2001_3893","article-title":"The human genome structure and organization","volume":"48","author":"Maka\u0142owski","year":"2001","journal-title":"Acta Biochim. Pol"},{"key":"2023062312042251700_btaa519-B32","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1093\/bioinformatics\/btf878","article-title":"LTR_STRUC: a novel search and identification program for LTR retrotransposons","volume":"19","author":"McCarthy","year":"2003","journal-title":"Bioinformatics"},{"key":"2023062312042251700_btaa519-B33","doi-asserted-by":"crossref","first-page":"280","DOI":"10.4236\/jbise.2016.95021","article-title":"DNA sequence classification by convolutional neural network","volume":"9","author":"Nguyen","year":"2016","journal-title":"J. Biomed. Sci. Eng"},{"key":"2023062312042251700_btaa519-B34","doi-asserted-by":"crossref","first-page":"1410","DOI":"10.1104\/pp.17.01310","article-title":"LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons","volume":"176","author":"Ou","year":"2018","journal-title":"Plant Physiol"},{"key":"2023062312042251700_btaa519-B35","doi-asserted-by":"crossref","first-page":"825","DOI":"10.1038\/nbt.3313","article-title":"Deep learning for regulatory genomics","volume":"33","author":"Park","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023062312042251700_btaa519-B36","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1093\/gbe\/evw009","article-title":"Accurate transposable element annotation is vital when analyzing new genome assemblies","volume":"8","author":"Platt","year":"2016","journal-title":"Genome Biol. Evol"},{"key":"2023062312042251700_btaa519-B37","doi-asserted-by":"crossref","first-page":"e22","DOI":"10.1371\/journal.pcbi.0010022","article-title":"Combined evidence annotation of transposable elements in genome sequences","volume":"1","author":"Quesneville","year":"2005","journal-title":"PLoS Comput. Biol"},{"key":"2023062312042251700_btaa519-B39","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1186\/1471-2164-8-90","article-title":"De novo identification of LTR retrotransposons in eukaryotic genomes","volume":"8","author":"Rho","year":"2007","journal-title":"BMC Genomics"},{"key":"2023062312042251700_btaa519-B40","doi-asserted-by":"crossref","first-page":"765","DOI":"10.1126\/science.274.5288.765","article-title":"Nested retrotransposons in the intergenic regions of the maize genome","volume":"274","author":"SanMiguel","year":"1996","journal-title":"Science"},{"key":"2023062312042251700_btaa519-B41","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.neunet.2014.09.003","article-title":"Deep learning in neural networks: an overview","volume":"61","author":"Schmidhuber","year":"2015","journal-title":"Neural Netw"},{"key":"2023062312042251700_btaa519-B42","doi-asserted-by":"crossref","first-page":"1112","DOI":"10.1126\/science.1178534","article-title":"The B73 maize genome: complexity, diversity, and dynamics","volume":"326","author":"Schnable","year":"2009","journal-title":"Science"},{"key":"2023062312042251700_btaa519-B43","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.plantsci.2019.03.020","article-title":"Machine learning approaches and their current application in plant molecular biology: a systematic review","volume":"284","author":"Silva","year":"2019","journal-title":"Plant Sci"},{"key":"2023062312042251700_btaa519-B44","author":"Smit","year":"2008"},{"key":"2023062312042251700_btaa519-B45","doi-asserted-by":"crossref","first-page":"D1141","DOI":"10.1093\/nar\/gkv1130","article-title":"PGSB PlantsDB: updates to the database framework for comparative plant genome research","volume":"44","author":"Spannagl","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023062312042251700_btaa519-B46","doi-asserted-by":"crossref","first-page":"7002","DOI":"10.1093\/nar\/gkp759","article-title":"Fine-grained annotation and classification of de novo predicted LTR retrotransposons","volume":"37","author":"Steinbiss","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023062312042251700_btaa519-B47","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1016\/j.molp.2019.02.008","article-title":"TIR-Learner, a new ensemble method for TIR transposable element annotation, provides evidence for abundant new transposable elements in the maize genome","volume":"12","author":"Su","year":"2019","journal-title":"Mol. Plant"},{"key":"2023062312042251700_btaa519-B48","doi-asserted-by":"crossref","first-page":"973","DOI":"10.1038\/nrg2165","article-title":"A unified classification system for eukaryotic transposable elements","volume":"8","author":"Wicker","year":"2007","journal-title":"Nat. Rev. Genet"},{"key":"2023062312042251700_btaa519-B49","doi-asserted-by":"crossref","first-page":"W265","DOI":"10.1093\/nar\/gkm286","article-title":"LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons","volume":"35","author":"Xu","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023062312042251700_btaa519-B50","doi-asserted-by":"crossref","first-page":"19688","DOI":"10.1038\/srep19688","article-title":"detectMITE: a novel approach to detect miniature inverted repeat transposable elements in genomes","volume":"6","author":"Ye","year":"2016","journal-title":"Sci. Rep"},{"key":"2023062312042251700_btaa519-B51","doi-asserted-by":"crossref","first-page":"402","DOI":"10.3389\/fpls.2017.00402","article-title":"LTRtype, an efficient tool to characterize structurally complex LTR retrotransposons and nested insertions on genomes","volume":"8","author":"Zeng","year":"2017","journal-title":"Front. Plant Sci"},{"key":"2023062312042251700_btaa519-B52","doi-asserted-by":"crossref","first-page":"i121","DOI":"10.1093\/bioinformatics\/btw255","article-title":"Convolutional neural network architectures for predicting DNA\u2013protein binding","volume":"32","author":"Zeng","year":"2016","journal-title":"Bioinformatics"},{"key":"2023062312042251700_btaa519-B53","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1038\/nmeth.3547","article-title":"Predicting effects of noncoding variants with deep learning\u2013based sequence model","volume":"12","author":"Zhou","year":"2015","journal-title":"Nat. Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa519\/33546676\/btaa519.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/15\/4269\/50671715\/bioinformatics_36_15_4269.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/15\/4269\/50671715\/bioinformatics_36_15_4269.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T18:00:01Z","timestamp":1687629601000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/15\/4269\/5838183"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,5,16]]},"references-count":53,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2020,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa519","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.01.27.921874","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,8,1]]},"published":{"date-parts":[[2020,5,16]]}}}