{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:18Z","timestamp":1772138058900,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","funder":[{"name":"Intramural Research Program"},{"DOI":"10.13039\/100000092","name":"National Library of Medicine","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,6,30]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Predicting the regulatory function of non-coding DNA using only the DNA sequence continues to be a major challenge in genomics. With the advent of improved optimization algorithms, faster GPU speeds, and more intricate machine-learning libraries, hybrid convolutional and recurrent neural network architectures can be constructed and applied to extract crucial information from non-coding DNA.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Using a comparative analysis of the performance of thousands of Deep Learning architectures, we developed ChromDL, a neural network architecture combining bidirectional gated recurrent units, convolutional neural networks, and bidirectional long short-term memory units, which significantly improves upon a range of prediction metrics compared to its predecessors in transcription factor binding site, histone modification, and DNase-I hyper-sensitive site detection. Combined with a secondary model, it can be utilized for accurate classification of gene regulatory elements. The model can also detect weak transcription factor binding as compared to previously developed methods and has the potential to help delineate transcription factor binding motif specificities.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The ChromDL source code can be found at https:\/\/github.com\/chrishil1\/ChromDL.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad217","type":"journal-article","created":{"date-parts":[[2023,5,24]],"date-time":"2023-05-24T17:10:48Z","timestamp":1684948248000},"page":"i377-i385","source":"Crossref","is-referenced-by-count":2,"title":["ChromDL: a next-generation regulatory DNA classifier"],"prefix":"10.1093","volume":"39","author":[{"given":"Christopher","family":"Hill","sequence":"first","affiliation":[{"name":"Computational Biology Branch, Intramural Research Program, National Library of Medicine, National Institutes of Health , Bethesda, MD 20892, United States"},{"name":"School of Engineering and Applied Science, University of Pennsylvania , Philadelphia, PA 19104, United States"}]},{"given":"Sanjarbek","family":"Hudaiberdiev","sequence":"additional","affiliation":[{"name":"Computational Biology Branch, Intramural Research Program, National Library of Medicine, National Institutes of Health , Bethesda, MD 20892, United States"}]},{"given":"Ivan","family":"Ovcharenko","sequence":"additional","affiliation":[{"name":"Computational Biology Branch, Intramural Research Program, National Library of Medicine, National Institutes of Health , Bethesda, MD 20892, United States"}]}],"member":"286","published-online":{"date-parts":[[2023,6,30]]},"reference":[{"key":"2023063008163087400_btad217-B1","author":"Abadi","year":"2016"},{"key":"2023063008163087400_btad217-B2","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1093\/bioinformatics\/btx583","article-title":"SNPDelScore: combining multiple methods to score deleterious effects of noncoding mutations in the human genome","volume":"34","author":"Alvarez","year":"2018","journal-title":"Bioinformatics"},{"key":"2023063008163087400_btad217-B3","doi-asserted-by":"crossref","first-page":"878","DOI":"10.15252\/msb.20156651","article-title":"Deep learning for computational biology","volume":"12","author":"Angermueller","year":"2016","journal-title":"Mol Syst Biol"},{"key":"2023063008163087400_btad217-B4","doi-asserted-by":"crossref","first-page":"W202","DOI":"10.1093\/nar\/gkp335","article-title":"MEME SUITE: tools for motif discovery and searching","volume":"37","author":"Bailey","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B5","first-page":"28","article-title":"Fitting a mixture model by expectation maximization to discover motifs in biopolymers","volume":"2","author":"Bailey","year":"1994","journal-title":"Proc Int Conf Intell Syst Mol Biol"},{"key":"2023063008163087400_btad217-B6","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1109\/72.279181","article-title":"Learning long-term dependencies with gradient descent is difficult","volume":"5","author":"Bengio","year":"1994","journal-title":"IEEE Trans Neural Netw"},{"key":"2023063008163087400_btad217-B7","author":"Cho","year":"2014"},{"key":"2023063008163087400_btad217-B8","author":"Chollet","year":"2015"},{"key":"2023063008163087400_btad217-B9","doi-asserted-by":"crossref","first-page":"D794","DOI":"10.1093\/nar\/gkx1081","article-title":"The encyclopedia of DNA elements (ENCODE): data portal update","volume":"46","author":"Davis","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B10","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nature11247","article-title":"An integrated encyclopedia of DNA elements in the human genome","volume":"489","author":"ENCODE Project Consortium","year":"2012","journal-title":"Nature"},{"key":"2023063008163087400_btad217-B11","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1038\/s41576-019-0122-6","article-title":"Deep learning: new computational modelling techniques for genomics","volume":"20","author":"Eraslan","year":"2019","journal-title":"Nat Rev Genet"},{"key":"2023063008163087400_btad217-B12","doi-asserted-by":"crossref","first-page":"2451","DOI":"10.1162\/089976600300015015","article-title":"Learning to forget: continual prediction with LSTM","volume":"12","author":"Gers","year":"2000","journal-title":"Neural Comput"},{"key":"2023063008163087400_btad217-B13","first-page":"273","author":"Graves","year":"2013"},{"key":"2023063008163087400_btad217-B14","doi-asserted-by":"crossref","first-page":"R24","DOI":"10.1186\/gb-2007-8-2-r24","article-title":"Quantifying similarity between motifs","volume":"8","author":"Gupta","year":"2007","journal-title":"Genome Biol"},{"key":"2023063008163087400_btad217-B15","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.molcel.2010.05.004","article-title":"Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities","volume":"38","author":"Heinz","year":"2010","journal-title":"Mol Cell"},{"key":"2023063008163087400_btad217-B16","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput"},{"key":"2023063008163087400_btad217-B17","article-title":"Modeling islet enhancers using deep learning identifies candidate causal variants at loci associated with T2D and glycemic traits","author":"Hudaiberdiev","year":"2022","journal-title":"medRxiv"},{"key":"2023063008163087400_btad217-B18","doi-asserted-by":"crossref","first-page":"D117","DOI":"10.1093\/nar\/gku1045","article-title":"UniPROBE, update 2015: new tools and content for the online database of protein-binding microarray data on protein-DNA interactions","volume":"43","author":"Hume","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B19","doi-asserted-by":"crossref","first-page":"860","DOI":"10.1038\/35057062","article-title":"Initial sequencing and analysis of the human genome","volume":"409","author":"International Human Genome Sequencing Consortium","year":"2001","journal-title":"Nature"},{"key":"2023063008163087400_btad217-B20","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1101\/gr.229102","article-title":"The human genome browser at UCSC","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res"},{"key":"2023063008163087400_btad217-B21","author":"Kingma","year":"2014"},{"key":"2023063008163087400_btad217-B22","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun ACM"},{"key":"2023063008163087400_btad217-B23","doi-asserted-by":"crossref","first-page":"1262","DOI":"10.1016\/j.cell.2020.02.031","article-title":"Comprehensive in vivo interrogation reveals phenotypic impact of human enhancer variants","volume":"180","author":"Kvon","year":"2020","journal-title":"Cell"},{"key":"2023063008163087400_btad217-B24","doi-asserted-by":"crossref","first-page":"1595","DOI":"10.1101\/gr.173518.114","article-title":"High-throughput functional testing of ENCODE segmentation predictions","volume":"24","author":"Kwasnieski","year":"2014","journal-title":"Genome Res"},{"key":"2023063008163087400_btad217-B25","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2023063008163087400_btad217-B26","first-page":"2307","article-title":"Quantifying deleterious effects of regulatory variants","volume":"45","author":"Li","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B27","article-title":"De novo human brain enhancers created by single nucleotide mutations","author":"Li"},{"key":"2023063008163087400_btad217-B28","doi-asserted-by":"crossref","first-page":"2161","DOI":"10.1093\/molbev\/msv118","article-title":"Human enhancers are fragile and prone to deactivating mutations","volume":"32","author":"Li","year":"2015","journal-title":"Mol Biol Evol"},{"key":"2023063008163087400_btad217-B29","doi-asserted-by":"crossref","first-page":"D110","DOI":"10.1093\/nar\/gkv1176","article-title":"JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles","volume":"44","author":"Mathelier","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B30","first-page":"851","article-title":"Deep learning in bioinformatics","volume":"18","author":"Min","year":"2017","journal-title":"Brief Bioinform"},{"key":"2023063008163087400_btad217-B31","doi-asserted-by":"crossref","first-page":"D1188","DOI":"10.1093\/nar\/gkac1072","article-title":"The UCSC genome browser database: 2023 update","volume":"51","author":"Nassar","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B32","doi-asserted-by":"crossref","first-page":"e107","DOI":"10.1093\/nar\/gkw226","article-title":"DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences","volume":"44","author":"Quang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B33","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1093\/bioinformatics\/btq033","article-title":"BEDTools: a flexible suite of utilities for comparing genomic features","volume":"26","author":"Quinlan","year":"2010","journal-title":"Bioinformatics"},{"key":"2023063008163087400_btad217-B34","author":"Reddi","year":"2019"},{"key":"2023063008163087400_btad217-B35","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1038\/nature14248","article-title":"Integrative analysis of 111 reference human epigenomes","volume":"518","author":"Roadmap Epigenomics Consortium","year":"2015","journal-title":"Nature"},{"key":"2023063008163087400_btad217-B36","doi-asserted-by":"crossref","first-page":"D56","DOI":"10.1093\/nar\/gks1172","article-title":"ENCODE data in the UCSC genome browser: year 5 update","volume":"41","author":"Rosenbloom","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023063008163087400_btad217-B37","doi-asserted-by":"crossref","first-page":"1160","DOI":"10.1038\/s41588-019-0455-2","article-title":"High-throughput identification of human SNPs affecting regulatory element activity","volume":"51","author":"van Arensbergen","year":"2019","journal-title":"Nat Genet"},{"key":"2023063008163087400_btad217-B38","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","article-title":"SciPy 1.0: fundamental algorithms for scientific computing in python","volume":"17","author":"Virtanen","year":"2020","journal-title":"Nat Methods"},{"key":"2023063008163087400_btad217-B39","doi-asserted-by":"crossref","first-page":"1431","DOI":"10.1016\/j.cell.2014.08.009","article-title":"Determination and inference of eukaryotic transcription factor sequence specificity","volume":"158","author":"Weirauch","year":"2014","journal-title":"Cell"},{"key":"2023063008163087400_btad217-B40","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1038\/nmeth.3547","article-title":"Predicting effects of noncoding variants with deep learning\u2013based sequence model","volume":"12","author":"Zhou","year":"2015","journal-title":"Nat Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/Supplement_1\/i377\/50741898\/btad217.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/Supplement_1\/i377\/50741898\/btad217.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,30]],"date-time":"2023-06-30T04:20:09Z","timestamp":1688098809000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/39\/Supplement_1\/i377\/7210509"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,1]]},"references-count":40,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2023,6,30]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad217","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.01.27.525971","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,6,1]]},"published":{"date-parts":[[2023,6,1]]}}}