{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,22]],"date-time":"2026-02-22T06:24:54Z","timestamp":1771741494698,"version":"3.50.1"},"reference-count":185,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2022,1,11]],"date-time":"2022-01-11T00:00:00Z","timestamp":1641859200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62071079"],"award-info":[{"award-number":["62071079"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Health and Medical Research Council of Australia","award":["APP1127948"],"award-info":[{"award-number":["APP1127948"]}]},{"name":"National Health and Medical Research Council of Australia","award":["APP1144652"],"award-info":[{"award-number":["APP1144652"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 AI111965"],"award-info":[{"award-number":["R01 AI111965"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Promoters are crucial regulatory DNA regions for gene transcriptional activation. Rapid advances in next-generation sequencing technologies have accelerated the accumulation of genome sequences, providing increased training data to inform computational approaches for both prokaryotic and eukaryotic promoter prediction. However, it remains a significant challenge to accurately identify species-specific promoter sequences using computational approaches. 
To advance computational support for promoter prediction, in this study, we curated 58 comprehensive, up-to-date, benchmark datasets for 7 different species (i.e. Escherichia coli, Bacillus subtilis, Homo sapiens, Mus musculus, Arabidopsis thaliana, Zea mays and Drosophila melanogaster) to assist the research community to assess the relative functionality of alternative approaches and support future research on both prokaryotic and eukaryotic promoters. We revisited 106 predictors published since 2000 for promoter identification (40 for prokaryotic promoter, 61 for eukaryotic promoter, and 5 for both). We systematically evaluated their training datasets, computational methodologies, calculated features, performance and software usability. On the basis of these benchmark datasets, we benchmarked 19 predictors with functioning webservers\/local tools and assessed their prediction performance. We found that deep learning and traditional machine learning\u2013based approaches generally outperformed scoring function\u2013based approaches. 
Taken together, the curated benchmark dataset repository and the benchmarking analysis in this study serve to inform the design and implementation of computational approaches for promoter prediction and facilitate more rigorous comparison of new techniques in the future.<\/jats:p>","DOI":"10.1093\/bib\/bbab551","type":"journal-article","created":{"date-parts":[[2021,12,1]],"date-time":"2021-12-01T12:18:05Z","timestamp":1638361085000},"source":"Crossref","is-referenced-by-count":24,"title":["Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction"],"prefix":"10.1093","volume":"23","author":[{"given":"Meng","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Science, Dalian Maritime University, Dalian 116026, China"}]},{"given":"Cangzhi","family":"Jia","sequence":"additional","affiliation":[{"name":"School of Science, Dalian Maritime University, Dalian 116026, China"}]},{"given":"Fuyi","family":"Li","sequence":"additional","affiliation":[{"name":"Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia"},{"name":"The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, VIC, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1847-754X","authenticated-orcid":false,"given":"Chen","family":"Li","sequence":"additional","affiliation":[{"name":"Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia"}]},{"given":"Yan","family":"Zhu","sequence":"additional","affiliation":[{"name":"School of Science, Dalian Maritime University, Dalian 116026, China"}]},{"given":"Tatsuya","family":"Akutsu","sequence":"additional","affiliation":[{"name":"Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan"}]},{"given":"Geoffrey I","family":"Webb","sequence":"additional","affiliation":[{"name":"Department of 
Data Science and Artificial Intelligence, Monash University, Melbourne, VIC 3800, Australia"},{"name":"Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia"}]},{"given":"Quan","family":"Zou","sequence":"additional","affiliation":[{"name":"Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China"}]},{"given":"Lachlan J M","family":"Coin","sequence":"additional","affiliation":[{"name":"The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, VIC, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8031-9086","authenticated-orcid":false,"given":"Jiangning","family":"Song","sequence":"additional","affiliation":[{"name":"Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia"},{"name":"Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia"}]}],"member":"286","published-online":{"date-parts":[[2022,1,11]]},"reference":[{"key":"2022031506262261400_ref1","doi-asserted-by":"crossref","first-page":"2583","DOI":"10.1101\/gad.1026202","article-title":"The RNA polymerase II core promoter: a key component in the regulation of gene expression","volume":"16","author":"Butler","year":"2002","journal-title":"Genes Dev"},{"key":"2022031506262261400_ref2","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1007\/s003359900963","article-title":"Models for prediction and recognition of eukaryotic promoters","volume":"10","author":"Werner","year":"1999","journal-title":"Mamm Genome"},{"key":"2022031506262261400_ref3","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1016\/j.ydbio.2009.08.009","article-title":"Regulation of gene expression via the core promoter and the basal transcriptional machinery","volume":"339","author":"Juven-Gershon","year":"2010","journal-title":"Dev 
Biol"},{"key":"2022031506262261400_ref4","doi-asserted-by":"crossref","first-page":"946","DOI":"10.1093\/bib\/bbz045","article-title":"Transcription factors-DNA interactions in rice: identification and verification","volume":"21","author":"Shen","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref5","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrmicro787","article-title":"The regulation of bacterial transcription initiation","volume":"2","author":"Browning","year":"2004","journal-title":"Nat Rev Microbiol"},{"key":"2022031506262261400_ref6","doi-asserted-by":"crossref","first-page":"839","DOI":"10.1146\/annurev.bi.57.070188.004203","article-title":"Structure and function of bacterial sigma factors","volume":"57","author":"Helmann","year":"1988","journal-title":"Annu Rev Biochem"},{"key":"2022031506262261400_ref7","doi-asserted-by":"crossref","first-page":"2237","DOI":"10.1093\/nar\/11.8.2237","article-title":"Compilation and analysis of Escherichia coli promoter DNA sequences","volume":"11","author":"Hawley","year":"1983","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref8","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1146\/annurev.bi.65.070196.004005","article-title":"Biochemistry and structural biology of transcription factor IID (TFIID)","volume":"65","author":"Burley","year":"1996","journal-title":"Annu Rev Biochem"},{"key":"2022031506262261400_ref9","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1101\/sqb.1998.63.21","article-title":"The initiator element: a paradigm for core promoter heterogeneity within metazoan protein-coding genes","volume":"63","author":"Smale","year":"1998","journal-title":"Cold Spring Harb Symp Quant Biol"},{"key":"2022031506262261400_ref10","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1016\/S0955-0674(97)80002-6","article-title":"RNA polymerase II holoenzyme and transcriptional 
regulation","volume":"9","author":"Greenblatt","year":"1997","journal-title":"Curr Opin Cell Biol"},{"key":"2022031506262261400_ref11","doi-asserted-by":"crossref","first-page":"19962","DOI":"10.1016\/S0021-9258(17)32114-2","article-title":"Topological localization of the human transcription factors IIA, IIB, TATA box-binding protein, and RNA polymerase II-associated protein 30 on a class II promoter","volume":"269","author":"Coulombe","year":"1994","journal-title":"J Biol Chem"},{"key":"2022031506262261400_ref12","doi-asserted-by":"crossref","first-page":"6275","DOI":"10.1073\/pnas.0508169103","article-title":"DNA motifs in human and mouse proximal promoters predict tissue-specific expression","volume":"103","author":"Smith","year":"2006","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022031506262261400_ref13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-8-S6-S3","article-title":"Computational analyses of eukaryotic promoters","volume":"8","author":"Zhang","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2022031506262261400_ref14","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0187243","article-title":"Nucleotide patterns aiding in prediction of eukaryotic promoters","volume":"12","author":"Triska","year":"2017","journal-title":"Plos One"},{"key":"2022031506262261400_ref15","doi-asserted-by":"crossref","first-page":"1273","DOI":"10.1101\/gr.1119703","article-title":"Targeting a complex transcriptome: The construction of the mouse full-length cDNA encyclopedia","volume":"13","author":"Carninci","year":"2003","journal-title":"Genome Res"},{"key":"2022031506262261400_ref16","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1007\/s11103-008-9415-4","article-title":"Insights into corn genes derived from large-scale cDNA sequencing","volume":"69","author":"Alexandrov","year":"2009","journal-title":"Plant Mol 
Biol"},{"key":"2022031506262261400_ref17","doi-asserted-by":"crossref","first-page":"15776","DOI":"10.1073\/pnas.2136655100","article-title":"Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage","volume":"100","author":"Shiraki","year":"2003","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022031506262261400_ref18","doi-asserted-by":"crossref","first-page":"2746","DOI":"10.1105\/tpc.114.125617","article-title":"Paired-end analysis of transcription start sites in arabidopsis reveals plant-specific promoter signatures","volume":"26","author":"Morton","year":"2014","journal-title":"Plant Cell"},{"key":"2022031506262261400_ref19","doi-asserted-by":"crossref","first-page":"Unit 25B.11","DOI":"10.1002\/0471142727.mb25b11s104","article-title":"RAMPAGE: promoter activity profiling by paired-end sequencing of 5\u2032-complete cDNAs","volume":"104","author":"Batut","year":"2013","journal-title":"Curr Protoc Mol Biol"},{"key":"2022031506262261400_ref20","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1101\/gr.7.9.861","article-title":"Eukaryotic promoter recognition","volume":"7","author":"Fickett","year":"1997","journal-title":"Genome Res"},{"key":"2022031506262261400_ref21","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/S0097-8485(99)00015-7","article-title":"The biology of eukaryotic promoter prediction\u2014a review","volume":"23","author":"Pedersen","year":"1999","journal-title":"Comput Chem"},{"key":"2022031506262261400_ref22","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1016\/S0168-9525(00)02174-0","article-title":"Identification and analysis of eukaryotic promoters: recent computational approaches","volume":"17","author":"Ohler","year":"2001","journal-title":"Trends Genet"},{"key":"2022031506262261400_ref23","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1093\/bib\/4.1.22","article-title":"The state of the art of mammalian promoter 
recognition","volume":"4","author":"Werner","year":"2003","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref24","doi-asserted-by":"crossref","first-page":"1467","DOI":"10.1038\/nbt1032","article-title":"Promoter prediction analysis on the whole human genome","volume":"22","author":"Bajic","year":"2004","journal-title":"Nat Biotechnol"},{"key":"2022031506262261400_ref25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2006-7-s1-s3","article-title":"Performance assessment of promoter predictions on ENCODE regions in the EGASP experiment","volume":"7","author":"Bajic","year":"2006","journal-title":"Genome Biol"},{"key":"2022031506262261400_ref26","doi-asserted-by":"crossref","first-page":"I313","DOI":"10.1093\/bioinformatics\/btp191","article-title":"Toward a gold standard for promoter prediction evaluation","volume":"25","author":"Abeel","year":"2009","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref27","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1093\/bib\/bbp027","article-title":"Towards accurate human promoter recognition: a review of currently used sequence features and classification methods","volume":"10","author":"Zeng","year":"2009","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref28","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1080\/15257770.2015.1013126","article-title":"A review of computational intelligence methods for eukaryotic promoter prediction","volume":"34","author":"Singh","year":"2015","journal-title":"Nucleosides Nucleotides Nucleic Acids"},{"key":"2022031506262261400_ref29","article-title":"TSSPlant: a new tool for prediction of plant Pol II promoters","volume":"45","author":"Shahmuradov","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref30","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0171410","article-title":"Recognition of prokaryotic and eukaryotic promoters using convolutional deep learning neural 
networks","volume":"12","author":"Umarov","year":"2017","journal-title":"Plos One"},{"key":"2022031506262261400_ref31","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.omtn.2019.05.028","article-title":"iProEP: a computational predictor for predicting promoter","volume":"17","author":"Lai","year":"2019","journal-title":"Mol Ther Nucleic Acids"},{"key":"2022031506262261400_ref32","doi-asserted-by":"crossref","DOI":"10.3389\/fgene.2019.00286","article-title":"DeePromoter: robust promoter predictor using deep learning","volume":"10","author":"Oubounyt","year":"2019","journal-title":"Front Genet"},{"key":"2022031506262261400_ref33","doi-asserted-by":"crossref","first-page":"1964","DOI":"10.1093\/bioinformatics\/btg265","article-title":"Sequence alignment kernel for recognition of promoter regions","volume":"19","author":"Gordon","year":"2003","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref34","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1016\/j.jmb.2003.07.017","article-title":"Sigma70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals","volume":"333","author":"Huerta","year":"2003","journal-title":"J Mol Biol"},{"key":"2022031506262261400_ref35","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1093\/bioinformatics\/bti047","article-title":"Improving promoter prediction for the NNPP2.2 algorithm: a case study using Escherichia coli DNA sequences","volume":"21","author":"Burden","year":"2005","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref36","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1007\/11532323_9","volume-title":"Advances in Bioinformatics and Computational Biology, Proceedings","author":"Monteiro","year":"2005"},{"key":"2022031506262261400_ref37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-6-1","article-title":"A novel method for prokaryotic promoter prediction 
based on DNA stability","volume":"6","author":"Kanhere","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2022031506262261400_ref38","first-page":"2319","volume-title":"2006 IEEE International Joint Conference on Neural Network Proceedings","author":"Silva","year":"2006"},{"key":"2022031506262261400_ref39","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkl1024","article-title":"A pHMM-ANN based discriminative approach to promoter identification in prokaryote genomic contexts","volume":"35","author":"Mann","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-7-248","article-title":"Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress","volume":"7","author":"Wang","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2022031506262261400_ref41","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1093\/bioinformatics\/bti771","article-title":"Improved prediction of bacterial transcription start sites","volume":"22","author":"Gordon","year":"2006","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref42","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1142\/S0129065706000767","article-title":"The prediction of bacterial transcription start sites using SVMs","volume":"16","author":"Towsey","year":"2006","journal-title":"Int J Neural Syst"},{"key":"2022031506262261400_ref43","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1016\/j.jtbi.2006.02.007","article-title":"The recognition and prediction of Sigma(70) promoters in Escherichia coli K-12","volume":"242","author":"Li","year":"2006","journal-title":"J Theor Biol"},{"key":"2022031506262261400_ref44","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1142\/9781860949852_0016","volume-title":"Genome Informatics 2007: Genome Informatics 
Series","author":"Towsey","year":"2007"},{"key":"2022031506262261400_ref45","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1016\/j.resmic.2007.08.005","article-title":"Genome-wide analysis of chlamydiae for promoters that phylogenetically footprint","volume":"158","author":"Grech","year":"2007","journal-title":"Res Microbiol"},{"key":"2022031506262261400_ref46","doi-asserted-by":"crossref","first-page":"851","DOI":"10.1007\/s12038-007-0085-1","article-title":"Identification and annotation of promoter regions in microbial genome sequences on the basis of DNA stability","volume":"32","author":"Rangannan","year":"2007","journal-title":"J Biosci"},{"key":"2022031506262261400_ref47","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/j.compbiolchem.2008.07.009","article-title":"The cross-species prediction of bacterial promoters using a support vector machine","volume":"32","author":"Towsey","year":"2008","journal-title":"Comput Biol Chem"},{"key":"2022031506262261400_ref48","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1266\/ggs.84.425","article-title":"N4: a precise and highly sensitive promoter predictor using neural network fed by nearest neighbors","volume":"84","author":"Askary","year":"2009","journal-title":"Genes Genet Syst"},{"key":"2022031506262261400_ref49","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.eswa.2007.09.010","article-title":"A new method to forecast of Escherichia coli promoter gene sequences: integrating feature selection and Fuzzy-AIRS classifier system","volume":"36","author":"Polat","year":"2009","journal-title":"Expert Syst Appl"},{"key":"2022031506262261400_ref50","doi-asserted-by":"crossref","first-page":"1758","DOI":"10.1039\/b906535k","article-title":"Relative stability of DNA as a generic criterion for promoter prediction: whole genome annotation of microbial genomes with varying nucleotide base composition","volume":"5","author":"Rangannan","year":"2009","journal-title":"Mol 
Biosyst"},{"key":"2022031506262261400_ref51","doi-asserted-by":"crossref","first-page":"3043","DOI":"10.1093\/bioinformatics\/btq577","article-title":"High-quality annotation of promoter regions for 913 bacterial genomes","volume":"26","author":"Rangannan","year":"2010","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref52","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1016\/j.jtbi.2011.07.017","article-title":"BacPP: Bacterial promoter prediction\u2014a tool for accurate sigma-factor specific assignment in enterobacteria","volume":"287","author":"Avila e Silva","year":"2011","journal-title":"J Theor Biol"},{"key":"2022031506262261400_ref53","doi-asserted-by":"crossref","first-page":"963","DOI":"10.1093\/nar\/gkr795","article-title":"Recognition of prokaryotic promoters based on a novel variable-window Z-curve method","volume":"40","author":"Song","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref54","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0045097","article-title":"Genome-wide prediction and validation of Sigma70 promoters in Lactobacillus plantarum WCFS1","volume":"7","author":"Todt","year":"2012","journal-title":"Plos One"},{"key":"2022031506262261400_ref55","doi-asserted-by":"crossref","first-page":"12961","DOI":"10.1093\/nar\/gku1019","article-title":"iPro54-PseKNC: a sequence-based predictor for identifying Sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition","volume":"42","author":"Lin","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref56","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1016\/j.biologicals.2013.10.001","article-title":"DNA duplex stability as discriminative characteristic for Escherichia coli Sigma(54)- and Sigma(28)-dependent promoter sequences","volume":"42","author":"Avila e 
Silva","year":"2014","journal-title":"Biologicals"},{"key":"2022031506262261400_ref57","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1093\/bioinformatics\/btw629","article-title":"bTSSfinder: a novel tool for the prediction of promoters in cyanobacteria and Escherichia coli","volume":"33","author":"Shahmuradov","year":"2017","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref58","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1093\/bioinformatics\/btx579","article-title":"iPromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based PseKNC","volume":"34","author":"Liu","year":"2018","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref59","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1186\/s12918-018-0570-1","article-title":"70ProPred: a predictor for discovering Sigma70 promoters based on combining multiple features","volume":"12","author":"He","year":"2018","journal-title":"BMC Syst Biol"},{"key":"2022031506262261400_ref60","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-018-36308-0","article-title":"Image-based promoter prediction: a promoter prediction method based on evolutionarily generated patterns","volume":"8","author":"Wang","year":"2018","journal-title":"Sci Rep"},{"key":"2022031506262261400_ref61","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1016\/j.dib.2018.05.025","article-title":"Bacillus subtilis promoter sequences data set for promoter prediction in Gram-positive bacteria","volume":"19","author":"Coelho","year":"2018","journal-title":"Data Brief"},{"key":"2022031506262261400_ref62","doi-asserted-by":"crossref","first-page":"1316","DOI":"10.1109\/TCBB.2017.2666141","article-title":"Identifying Sigma70 promoters with novel pseudo nucleotide composition","volume":"16","author":"Lin","year":"2019","journal-title":"IEEE\/ACM Trans Comput Biol 
Bioinform"},{"key":"2022031506262261400_ref63","doi-asserted-by":"crossref","first-page":"1160","DOI":"10.1016\/j.ygeno.2018.07.011","article-title":"iPromoter-FSEn: identification of bacterial Sigma(70) promoter sequences using feature subspace based ensemble classifier","volume":"111","author":"Rahman","year":"2019","journal-title":"Genomics"},{"key":"2022031506262261400_ref64","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1007\/s00438-018-1487-5","article-title":"iPro70-FMWin: identifying Sigma70 promoters using multiple windowing and minimal features","volume":"294","author":"Rahman","year":"2019","journal-title":"Mol Genet Genomics"},{"key":"2022031506262261400_ref65","doi-asserted-by":"crossref","first-page":"1785","DOI":"10.1016\/j.ygeno.2018.12.001","article-title":"iPSW(2L)-PseKNC: a two-layer predictor for identifying promoters and their strength by hybrid features via pseudo K-tuple nucleotide composition","volume":"111","author":"Xiao","year":"2019","journal-title":"Genomics"},{"key":"2022031506262261400_ref66","doi-asserted-by":"crossref","first-page":"2957","DOI":"10.1093\/bioinformatics\/btz016","article-title":"MULTiPly: a novel multi-layer predictor for discovering general and specific types of promoters","volume":"35","author":"Zhang","year":"2019","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref67","article-title":"Classifying promoters by interpreting the hidden information of DNA sequences via deep learning and combination of continuous fasttext N-grams","volume":"7","author":"Nguyen Quoc Khanh","year":"2019","journal-title":"Front Bioeng Biotechnol"},{"key":"2022031506262261400_ref68","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1016\/j.omtn.2019.08.008","article-title":"iPromoter-2L2.0: identifying promoters and their types by combining smoothing cutting window algorithm and sequence-based features","volume":"18","author":"Liu","year":"2019","journal-title":"Mol Ther Nucleic 
Acids"},{"key":"2022031506262261400_ref69","article-title":"Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework","volume":"22","author":"Li","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref70","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btaa609","article-title":"iPromoter-BnCNN: a novel branched CNN based predictor for identifying and classifying sigma promoters","volume":"36","author":"Amin","year":"2020","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref71","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-018-2049-x","article-title":"G4PromFinder: an algorithm for predicting transcription promoters in GC-rich bacterial genomes based on AT-rich elements and G-quadruplex motifs","volume":"19","author":"Di Salvo","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2022031506262261400_ref72","first-page":"9","volume-title":"Proceedings of the 2006 Workshop on Intelligent Systems for Bioinformatics","author":"Maetschke","year":"2006"},{"key":"2022031506262261400_ref73","doi-asserted-by":"crossref","first-page":"599","DOI":"10.1006\/jmbi.2000.3589","article-title":"Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel context analysis approach","volume":"297","author":"Scherf","year":"2000","journal-title":"J Mol Biol"},{"key":"2022031506262261400_ref74","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1038\/79189","article-title":"Large-scale human promoter mapping using CpG islands","volume":"26","author":"Ioshikhes","year":"2000","journal-title":"Nat Genet"},{"key":"2022031506262261400_ref75","first-page":"380","article-title":"Stochastic segment models of eukaryotic promoter regions","author":"Ohler","year":"2000","journal-title":"Pac Symp Biocomput"},{"issue":"Suppl 
1","key":"2022031506262261400_ref76","doi-asserted-by":"crossref","first-page":"S199","DOI":"10.1093\/bioinformatics\/17.suppl_1.S199","article-title":"Joint modeling of DNA sequence and physical properties to improve eukaryotic promoter recognition","volume":"17","author":"Ohler","year":"2001","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2022031506262261400_ref77","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1038\/ng780","article-title":"Computational identification of promoters and first exons in the human genome","volume":"29","author":"Davuluri","year":"2001","journal-title":"Nat Genet"},{"issue":"Suppl 1","key":"2022031506262261400_ref78","doi-asserted-by":"crossref","first-page":"S90","DOI":"10.1093\/bioinformatics\/17.suppl_1.S90","article-title":"Promoter prediction in the human genome","volume":"17","author":"Hannenhalli","year":"2001","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2022031506262261400_ref79","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1016\/S0097-8485(01)00099-7","article-title":"Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome","volume":"26","author":"Reese","year":"2001","journal-title":"Comput Chem"},{"key":"2022031506262261400_ref80","doi-asserted-by":"crossref","first-page":"826","DOI":"10.1023\/A:1013278000196","article-title":"Computer analysis and recognition of Drosophila melanogaster gene promoters","volume":"35","author":"Levitsky","year":"2001","journal-title":"Mol Biol"},{"key":"2022031506262261400_ref81","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1101\/gr.216102","article-title":"Computational detection and location of transcription start sites in mammalian genomic DNA","volume":"12","author":"Down","year":"2002","journal-title":"Genome Res"},{"key":"2022031506262261400_ref82","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1093\/bioinformatics\/18.4.631","article-title":"CpGProD: identifying 
CpG islands associated with transcription start sites in large genomic mammalian sequences","volume":"18","author":"Ponger","year":"2002","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref83","doi-asserted-by":"crossref","first-page":"RESEARCH0087","DOI":"10.1186\/gb-2002-3-12-research0087","article-title":"Computational analysis of core promoters in the Drosophila genome","volume":"3","author":"Ohler","year":"2002","journal-title":"Genome Biol"},{"key":"2022031506262261400_ref84","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1093\/bioinformatics\/18.1.198","article-title":"Dragon promoter finder: recognition of vertebrate RNA polymerase II promoters","volume":"18","author":"Bajic","year":"2002","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref85","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1101\/gr.198002","article-title":"Consensus promoter identification in the human genome utilizing expressed gene markers and gene modeling","volume":"12","author":"Liu","year":"2002","journal-title":"Genome Res"},{"key":"2022031506262261400_ref86","doi-asserted-by":"crossref","first-page":"3554","DOI":"10.1093\/nar\/gkg549","article-title":"PromoSer: a large-scale mammalian promoter and transcription start site identification service","volume":"31","author":"Halees","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref87","volume-title":"Methods in enzymology","author":"Bajic","year":"2003"},{"key":"2022031506262261400_ref88","doi-asserted-by":"crossref","first-page":"3540","DOI":"10.1093\/nar\/gkg525","article-title":"PromH: promoters identification using orthologous genomic sequences","volume":"31","author":"Solovyev","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref89","doi-asserted-by":"crossref","first-page":"1923","DOI":"10.1101\/gr.869803","article-title":"Dragon gene start finder: an advanced system for finding approximate locations of the start of gene 
transcriptional units","volume":"13","author":"Bajic","year":"2003","journal-title":"Genome Res"},{"key":"2022031506262261400_ref90","first-page":"81","article-title":"Recognition of eukaryotic promoters using a genetic algorithm based on iterative discriminant analysis","volume":"3","author":"Levitsky","year":"2003","journal-title":"In Silico Biol"},{"key":"2022031506262261400_ref91","first-page":"1","volume-title":"International Conference on Neural Networks and Signal Processing, 2003. Proceedings of the 2003","author":"Kasabov","year":"2003"},{"key":"2022031506262261400_ref92","doi-asserted-by":"crossref","first-page":"250","DOI":"10.1093\/abbs\/36.4.250","article-title":"Predicting polymerase II core promoters by cooperating transcription factor binding sites in eukaryotic genes","volume":"36","author":"Ma","year":"2004","journal-title":"Acta Biochim Biophys Sin"},{"key":"2022031506262261400_ref93","doi-asserted-by":"crossref","first-page":"1332","DOI":"10.1093\/nar\/gki271","article-title":"Human pol II promoter prediction: time series descriptors and machine learning","volume":"33","author":"Gangal","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref94","doi-asserted-by":"crossref","first-page":"1069","DOI":"10.1093\/nar\/gki247","article-title":"Plant promoter prediction with confidence estimation","volume":"33","author":"Shahmuradov","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref95","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/j.artmed.2005.02.005","article-title":"Computational modeling of oligonucleotide positional densities for human promoter prediction","volume":"35","author":"Narang","year":"2005","journal-title":"Artif Intell Med"},{"key":"2022031506262261400_ref96","doi-asserted-by":"crossref","first-page":"2722","DOI":"10.1093\/bioinformatics\/btl482","article-title":"PromoterExplorer: an effective promoter identification method based on the AdaBoost 
algorithm","volume":"22","author":"Xie","year":"2006","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref97","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1016\/j.bbrc.2006.06.062","article-title":"A mammalian promoter model links cis elements to genetic networks","volume":"347","author":"Wang","year":"2006","journal-title":"Biochem Biophys Res Commun"},{"key":"2022031506262261400_ref98","doi-asserted-by":"crossref","first-page":"W578","DOI":"10.1093\/nar\/gkl193","article-title":"PromAn: an integrated knowledge-based web server dedicated to promoter analysis","volume":"34","author":"Lardenois","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref99","doi-asserted-by":"crossref","first-page":"E472","DOI":"10.1093\/bioinformatics\/btl250","article-title":"ARTS: accurate recognition of transcription starts in human","volume":"22","author":"Sonnenburg","year":"2006","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref100","doi-asserted-by":"crossref","DOI":"10.1186\/gb-2006-7-s1-s10","article-title":"Automatic annotation of eukaryotic genes, pseudogenes and promoters","volume":"7","author":"Solovyev","year":"2006","journal-title":"Genome Biol"},{"key":"2022031506262261400_ref101","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.biosystems.2005.09.001","article-title":"Computational analysis of plant RNA Pol-II promoters","volume":"83","author":"Pandey","year":"2006","journal-title":"Biosystems"},{"key":"2022031506262261400_ref102","doi-asserted-by":"crossref","first-page":"5943","DOI":"10.1093\/nar\/gkl608","article-title":"Identification of core promoter modules in Drosophila and their application in accurate transcription start site prediction","volume":"34","author":"Ohler","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref103","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2007-8-2-r17","article-title":"Boosting with stumps for 
predicting transcription start sites","volume":"8","author":"Zhao","year":"2007","journal-title":"Genome Biol"},{"key":"2022031506262261400_ref104","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1142\/9781860948732_0021","article-title":"Prediction of transcription start sites based on feature selection using AMOSA","volume":"6","author":"Wang","year":"2007","journal-title":"Comput Syst Bioinformatics Conf"},{"key":"2022031506262261400_ref105","doi-asserted-by":"crossref","DOI":"10.1103\/PhysRevE.75.041908","article-title":"Eukaryotic promoter prediction based on relative entropy and positional information","volume":"75","author":"Wu","year":"2007","journal-title":"Phys Rev E"},{"key":"2022031506262261400_ref106","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2007-8-12-r263","article-title":"Determining promoter location based on DNA structure first-principles calculations","volume":"8","author":"Goni","year":"2007","journal-title":"Genome Biol"},{"key":"2022031506262261400_ref107","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2164-8-374","article-title":"MetaProm: a neural network based meta-predictor for alternative human promoter prediction","volume":"8","author":"Wang","year":"2007","journal-title":"BMC Genomics"},{"key":"2022031506262261400_ref108","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-9-414","article-title":"Pol II promoter prediction using characteristic 4-mer motifs: a machine learning approach","volume":"9","author":"Anwar","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2022031506262261400_ref109","doi-asserted-by":"crossref","first-page":"316","DOI":"10.6026\/97320630002316","article-title":"Prediction for human transcription start site using diversity measure with quadratic 
discriminant","volume":"2","author":"Lu","year":"2008","journal-title":"Bioinformation"},{"key":"2022031506262261400_ref110","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1101\/gr.6991408","article-title":"Generic eukaryotic core promoter prediction using structural features of DNA","volume":"18","author":"Abeel","year":"2008","journal-title":"Genome Res"},{"key":"2022031506262261400_ref111","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1016\/j.ygeno.2007.11.001","article-title":"EnsemPro: an ensemble approach to predicting transcription start sites in human genomic DNA sequences","volume":"91","author":"Won","year":"2008","journal-title":"Genomics"},{"key":"2022031506262261400_ref112","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1016\/j.gene.2007.12.011","article-title":"DNA sequence and structural properties as predictors of human and mouse promoters","volume":"410","author":"Akan","year":"2008","journal-title":"Gene"},{"key":"2022031506262261400_ref113","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-9-113","article-title":"Human Pol II promoter recognition based on primary sequences and free energy of dinucleotides","volume":"9","author":"Yang","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2022031506262261400_ref114","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1089\/omi.2008.0034","article-title":"Genome-wide discovery of cis-elements in promoter sequences using gene expression","volume":"13","author":"Troukhan","year":"2009","journal-title":"OMICS"},{"key":"2022031506262261400_ref115","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1101\/gr.081638.108","article-title":"High-resolution human core-promoter prediction with CoreBoost_HM","volume":"19","author":"Wang","year":"2009","journal-title":"Genome Res"},{"key":"2022031506262261400_ref116","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0004878","article-title":"RBF-TSS: identification of 
transcription start site in human using radial basis functions network and oligonucleotide positional frequencies","volume":"4","author":"Mahdi","year":"2009","journal-title":"Plos One"},{"key":"2022031506262261400_ref117","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1109\/TCBB.2008.95","article-title":"SCS: signal, context, and structure features for genome-wide human promoter recognition","volume":"7","author":"Zeng","year":"2010","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2022031506262261400_ref118","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0013934","article-title":"High sensitivity TSS prediction: estimates of locations where TSS cannot occur","volume":"5","author":"Schaefer","year":"2010","journal-title":"Plos One"},{"key":"2022031506262261400_ref119","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1748-7188-6-19","article-title":"Prediction of plant promoters based on hexamers and random triplet pair analysis","volume":"6","author":"Azad","year":"2011","journal-title":"Algorithms Mol Biol"},{"key":"2022031506262261400_ref120","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1016\/j.ygeno.2010.11.002","article-title":"Identification of TATA and TATA-less promoters in plant genomes by integrating diversity measure, GC-Skew and DNA geometric flexibility","volume":"97","author":"Zuo","year":"2011","journal-title":"Genomics"},{"key":"2022031506262261400_ref121","doi-asserted-by":"crossref","first-page":"1300","DOI":"10.1104\/pp.110.167809","article-title":"DNA free energy-based promoter prediction and comparative analysis of arabidopsis and rice genomes","volume":"156","author":"Morey","year":"2011","journal-title":"Plant Physiol"},{"key":"2022031506262261400_ref122","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1109\/IWACI.2011.6160009","volume-title":"The Fourth International Workshop on Advanced Computational 
Intelligence","author":"Fang","year":"2011"},{"key":"2022031506262261400_ref123","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2164-13-S1-S3","article-title":"GPMiner: an integrated system for mining combinatorial cis-regulatory elements in mammalian gene group","volume":"13","author":"Lee","year":"2012","journal-title":"BMC Genomics"},{"key":"2022031506262261400_ref124","first-page":"261","article-title":"NPEST: a nonparametric method and a database for transcription start site prediction","volume":"1","author":"Tatarinova","year":"2013","journal-title":"Quant Biol (Beijing, China)"},{"key":"2022031506262261400_ref125","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1109\/TNB.2014.2327586","article-title":"ProMT: effective human promoter prediction using markov chain model based on DNA structural properties","volume":"13","author":"Xiong","year":"2014","journal-title":"IEEE Trans Nanobioscience"},{"key":"2022031506262261400_ref126","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-018-22129-8","article-title":"Identification of putative promoters in 48 eukaryotic genomes on the basis of DNA free energy","volume":"8","author":"Yella","year":"2018","journal-title":"Sci Rep"},{"key":"2022031506262261400_ref127","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1109\/COMPSAC.2018.00072","volume-title":"2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC)","author":"Qian","year":"2018"},{"key":"2022031506262261400_ref128","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1109\/TNB.2019.2891239","article-title":"DCDE: an efficient deep convolutional divergence encoding method for human promoter recognition","volume":"18","author":"Xu","year":"2019","journal-title":"IEEE Trans Nanobioscience"},{"key":"2022031506262261400_ref129","doi-asserted-by":"crossref","first-page":"2730","DOI":"10.1093\/bioinformatics\/bty1068","article-title":"Promoter analysis and prediction in the human 
genome using sequence-based deep learning models","volume":"35","author":"Umarov","year":"2019","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref130","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbaa299","article-title":"Computational identification of eukaryotic promoters based on cascaded deep capsule neural networks","volume":"22","author":"Zhu","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref131","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0054843","article-title":"A composite method based on formal grammar and DNA structural features in detecting human polymerase II promoter region","volume":"8","author":"Datta","year":"2013","journal-title":"Plos One"},{"key":"2022031506262261400_ref132","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1093\/bioinformatics\/btl670","article-title":"Analysis of E. coli promoter recognition problem in dinucleotide feature space","volume":"23","author":"Rani","year":"2007","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref133","doi-asserted-by":"crossref","first-page":"S1","DOI":"10.3233\/ISB-2009-0388","article-title":"Analysis of n-gram based promoter recognition methods and application to whole genome promoter prediction","volume":"9","author":"Rani","year":"2009","journal-title":"In Silico Biol"},{"key":"2022031506262261400_ref134","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1007\/s12064-010-0114-8","article-title":"Eukaryotic and prokaryotic promoter prediction using hybrid approach","volume":"130","author":"Lin","year":"2011","journal-title":"Theory Biosci"},{"key":"2022031506262261400_ref135","doi-asserted-by":"crossref","first-page":"D92","DOI":"10.1093\/nar\/gku1111","article-title":"The eukaryotic promoter database: expansion of EPDnew and new promoter analysis tools","volume":"43","author":"Dreos","year":"2015","journal-title":"Nucleic Acids 
Res"},{"key":"2022031506262261400_ref136","doi-asserted-by":"crossref","first-page":"D150","DOI":"10.1093\/nar\/gkr1005","article-title":"DBTSS: DataBase of Transcriptional Start Sites progress report in 2012","volume":"40","author":"Yamashita","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref137","doi-asserted-by":"crossref","first-page":"D212","DOI":"10.1093\/nar\/gky1077","article-title":"RegulonDB v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation in E. coli K-12","volume":"47","author":"Santos-Zavaleta","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref138","doi-asserted-by":"crossref","first-page":"D93","DOI":"10.1093\/nar\/gkm910","article-title":"DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information","volume":"36","author":"Sierro","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref139","doi-asserted-by":"crossref","first-page":"D884","DOI":"10.1093\/nar\/gkaa942","article-title":"Ensembl 2021","volume":"49","author":"Howe","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref140","doi-asserted-by":"crossref","first-page":"W589","DOI":"10.1093\/nar\/gkv350","article-title":"The BioMart community portal: an innovative alternative to large, centralized data repositories","volume":"43","author":"Smedley","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref141","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1101\/gr.229102","article-title":"The human genome browser at UCSC","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res"},{"key":"2022031506262261400_ref142","doi-asserted-by":"crossref","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","article-title":"Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide 
sequences","volume":"22","author":"Li","year":"2006","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref143","first-page":"1","article-title":"Sequence clustering in bioinformatics: an empirical study","volume":"21","author":"Zou","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref144","doi-asserted-by":"crossref","DOI":"10.1186\/s12918-016-0353-5","article-title":"Pretata: predicting TATA binding proteins with novel features and dimensionality reduction strategy","volume":"10","author":"Zou","year":"2016","journal-title":"BMC Syst Biol"},{"key":"2022031506262261400_ref145","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1146\/annurev.biochem.72.121801.161520","article-title":"The RNA polymerase II core promoter","volume":"72","author":"Smale","year":"2003","journal-title":"Annu Rev Biochem"},{"key":"2022031506262261400_ref146","doi-asserted-by":"crossref","first-page":"3740","DOI":"10.1073\/pnas.052410099","article-title":"Comprehensive analysis of CpG islands in human chromosomes 21 and 22","volume":"99","author":"Takai","year":"2002","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2022031506262261400_ref147","doi-asserted-by":"crossref","first-page":"1407","DOI":"10.1126\/science.8248780","article-title":"A third recognition element in bacterial promoters: DNA binding by the alpha subunit of RNA polymerase","volume":"262","author":"Ross","year":"1993","journal-title":"Science (New York, NY)"},{"key":"2022031506262261400_ref148","doi-asserted-by":"crossref","first-page":"2152","DOI":"10.1128\/JB.180.8.2152-2159.1998","article-title":"An AT-rich tract containing an integration host factor-binding domain and two UP-like elements enhances transcription from the pilEp(1) promoter of Neisseria gonorrhoeae","volume":"180","author":"Fyfe","year":"1998","journal-title":"J 
Bacteriol"},{"key":"2022031506262261400_ref149","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1016\/S0167-4781(96)00206-0","article-title":"Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes","volume":"1351","author":"Smale","year":"1997","journal-title":"Biochim Biophys Acta Gene Struct Express"},{"key":"2022031506262261400_ref150","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1016\/S0092-8674(04)00205-3","article-title":"Identification and distinct regulation of yeast TATA box-containing genes","volume":"116","author":"Basehoar","year":"2004","journal-title":"Cell"},{"key":"2022031506262261400_ref151","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1002\/wdev.21","article-title":"Perspectives on the RNA polymerase II core promoter","volume":"1","author":"Kadonaga","year":"2012","journal-title":"Wiley Interdiscip Rev Dev Biol"},{"key":"2022031506262261400_ref152","doi-asserted-by":"crossref","first-page":"2013","DOI":"10.1101\/gad.1951110","article-title":"The TCT motif, a key component of an RNA polymerase II transcription system for the translational machinery","volume":"24","author":"Parry","year":"2010","journal-title":"Genes Dev"},{"key":"2022031506262261400_ref153","doi-asserted-by":"crossref","first-page":"3471","DOI":"10.1128\/MCB.00053-10","article-title":"Three key subregions contribute to the function of the downstream RNA polymerase II core promoter","volume":"30","author":"Theisen","year":"2010","journal-title":"Mol Cell Biol"},{"key":"2022031506262261400_ref154","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1016\/0959-437X(95)80044-1","article-title":"CpG islands and genes","volume":"5","author":"Cross","year":"1995","journal-title":"Curr Opin Genet Dev"},{"key":"2022031506262261400_ref155","article-title":"DeepTorrent: a deep learning-based approach for predicting DNA N4-methylcytosine sites","volume":"22","author":"Liu","year":"2021","journal-title":"Brief 
Bioinform"},{"key":"2022031506262261400_ref156","doi-asserted-by":"crossref","first-page":"2796","DOI":"10.1093\/bioinformatics\/btz015","article-title":"i6mA-Pred: identifying DNA N-6-methyladenine sites in the rice genome","volume":"35","author":"Chen","year":"2019","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref157","doi-asserted-by":"crossref","first-page":"2185","DOI":"10.1093\/bib\/bby079","article-title":"Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework","volume":"20","author":"Zhang","year":"2019","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref158","doi-asserted-by":"crossref","first-page":"W65","DOI":"10.1093\/nar\/gkv458","article-title":"Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences","author":"Liu","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref159","doi-asserted-by":"crossref","first-page":"2499","DOI":"10.1093\/bioinformatics\/bty140","article-title":"iFeature: a python package and web server for features extraction and selection from protein and peptide sequences","volume":"34","author":"Zhen","year":"2018","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref160","article-title":"iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data","volume":"21","author":"Zhen","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref161","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkab122","article-title":"iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization","volume":"49","author":"Chen","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref162","article-title":"DNA Structure in 
Human RNA Polymerase II Promoters","volume-title":"Journal of molecular biology","author":"Pedersen"},{"key":"2022031506262261400_ref163","doi-asserted-by":"crossref","first-page":"2316","DOI":"10.1093\/nar\/gkl230","article-title":"Involvement of DNA curvature in intergenic regions of prokaryotes","volume":"34","author":"Kozobay-Avraham","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref164","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1002\/2211-5463.12166","article-title":"DNA structural features of eukaryotic TATA-containing and TATA-less promoters","volume":"7","author":"Yella","year":"2017","journal-title":"Febs Open Bio"},{"key":"2022031506262261400_ref165","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1007\/978-94-017-9514-2_4","volume-title":"Systems and Synthetic Biology","author":"Yella","year":"2015"},{"key":"2022031506262261400_ref166","doi-asserted-by":"crossref","first-page":"2445","DOI":"10.1016\/j.csbj.2020.09.001","article-title":"ncPro-ML: an integrated computational tool for identifying non-coding RNA promoters in multiple species","volume":"18","author":"Tang","year":"2020","journal-title":"Comput Struct Biotechnol J"},{"key":"2022031506262261400_ref167","doi-asserted-by":"crossref","first-page":"2617","DOI":"10.1016\/j.ymthe.2021.04.004","article-title":"mRNALocater: enhance the prediction accuracy of eukaryotic mRNA subcellular localization by using model fusion strategy","volume":"29","author":"Tang","year":"2021","journal-title":"Mol Ther"},{"key":"2022031506262261400_ref168","first-page":"148","volume-title":"Proceedings of the Thirteenth International Conference (ICML '96)","author":"Freund","year":"1996"},{"key":"2022031506262261400_ref169","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy function approximation: a gradient boosting machine","volume":"29","author":"Friedman","year":"2001","journal-title":"Ann 
Stat"},{"key":"2022031506262261400_ref170","doi-asserted-by":"crossref","DOI":"10.1145\/2939672.2939785","article-title":"Xgboost: A scalable tree boosting system","author":"Chen","year":"2016","journal-title":"Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining"},{"key":"2022031506262261400_ref171","volume-title":"Advances in Neural Information Processing Systems 30","author":"Ke"},{"key":"2022031506262261400_ref172","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1093\/jamia\/ocz200","article-title":"Deep learning in clinical natural language processing: a methodical review","volume":"27","author":"Wu","year":"2020","journal-title":"J Am Med Inform Assoc"},{"key":"2022031506262261400_ref173","doi-asserted-by":"crossref","DOI":"10.1126\/sciadv.aap7885","article-title":"Deep reinforcement learning for de novo drug design","volume":"4","author":"Popova","year":"2018","journal-title":"Sci Adv"},{"key":"2022031506262261400_ref174","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun ACM"},{"key":"2022031506262261400_ref175","doi-asserted-by":"crossref","first-page":"2673","DOI":"10.1109\/78.650093","article-title":"Bidirectional recurrent neural networks","volume":"45","author":"Schuster","year":"1997","journal-title":"IEEE Trans Signal Process"},{"key":"2022031506262261400_ref176","doi-asserted-by":"crossref","first-page":"4223","DOI":"10.1093\/bioinformatics\/bty522","article-title":"Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human 
proteome","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref177","doi-asserted-by":"crossref","first-page":"1057","DOI":"10.1093\/bioinformatics\/btz721","article-title":"DeepCleave: a deep learning predictor for caspase and matrix metalloprotease substrates and cleavage sites","volume":"36","author":"Li","year":"2020","journal-title":"Bioinformatics"},{"key":"2022031506262261400_ref178","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1093\/bib\/bbx123","article-title":"Critical assessment and performance improvement of plant-pathogen protein-protein interaction prediction methods","volume":"20","author":"Yang","year":"2019","journal-title":"Brief Bioinform"},{"key":"2022031506262261400_ref179","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","article-title":"Comparison of the predicted and observed secondary structure of T4 phage lysozyme","volume":"405","author":"Matthews","year":"1975","journal-title":"Biochim Biophys Acta"},{"key":"2022031506262261400_ref180","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1007\/978-3-642-01307-2_43","volume-title":"13th Pacific-Asia Conference on Knowledge Discovery and Data Mining","author":"Bunkhumpornpat","year":"2009"},{"key":"2022031506262261400_ref181","doi-asserted-by":"crossref","first-page":"1937","DOI":"10.1109\/TCBB.2019.2957758","article-title":"Formator: predicting lysine formylation sites based on the most distant undersampling and safe-level synthetic minority oversampling","volume":"18","author":"Jia","year":"2021","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2022031506262261400_ref182","doi-asserted-by":"crossref","DOI":"10.1186\/s13059-021-02365-4","article-title":"On the optimistic performance evaluation of newly introduced bioinformatic methods","volume":"22","author":"Buchka","year":"2021","journal-title":"Genome 
Biol"},{"key":"2022031506262261400_ref183","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1186\/1756-0500-4-257","article-title":"PromBase: a web resource for various genomic features and predicted promoters in prokaryotic genomes","volume":"4","author":"Rangannan","year":"2011","journal-title":"BMC Res Notes"},{"key":"2022031506262261400_ref184","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1093\/nar\/28.1.302","article-title":"The eukaryotic promoter database (EPD)","volume":"28","author":"Perier","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2022031506262261400_ref185","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1093\/nar\/gkg041","article-title":"PlantProm: a database of plant promoter sequences","volume":"31","author":"Shahmuradov","year":"2003","journal-title":"Nucleic Acids Res"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbab551\/42805737\/bbab551.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbab551\/42805737\/bbab551.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,13]],"date-time":"2023-11-13T12:46:13Z","timestamp":1699879573000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbab551\/6502561"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,11]]},"references-count":185,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,3,10]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbab551","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,3]]},
"published":{"date-parts":[[2022,1,11]]},"article-number":"bbab551"}}