{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T18:37:07Z","timestamp":1780079827775,"version":"3.54.0"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2019,1,2]],"date-time":"2019-01-02T00:00:00Z","timestamp":1546387200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100004052","name":"King Abdullah University of Science and Technology","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004052","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004052","name":"KAUST","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004052","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Office of Sponsored Research"},{"name":"OSR","award":["FCC\/1\/1976-17-01"],"award-info":[{"award-number":["FCC\/1\/1976-17-01"]}]},{"name":"OSR","award":["FCC\/1\/1976-18-01"],"award-info":[{"award-number":["FCC\/1\/1976-18-01"]}]},{"name":"OSR","award":["FCC\/1\/1976-23-01"],"award-info":[{"award-number":["FCC\/1\/1976-23-01"]}]},{"name":"OSR","award":["FCC\/1\/1976-25-01"],"award-info":[{"award-number":["FCC\/1\/1976-25-01"]}]},{"name":"OSR","award":["FCC\/1\/1976-26-01"],"award-info":[{"award-number":["FCC\/1\/1976-26-01"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Computational identification of promoters is notoriously difficult as human genes often have unique promoter sequences that provide regulation of transcription and interaction with transcription initiation complex. 
While there are many attempts to develop computational promoter identification methods, we have no reliable tool to analyze long genomic sequences.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>In this work, we further develop our deep learning approach that was relatively successful to discriminate short promoter and non-promoter sequences. Instead of focusing on the classification accuracy, in this work we predict the exact positions of the transcription start site inside the genomic sequences testing every possible location. We studied human promoters to find effective regions for discrimination and built corresponding deep learning models. These models use adaptively constructed negative set, which iteratively improves the model\u2019s discriminative ability. Our method significantly outperforms the previously developed promoter prediction programs by considerably reducing the number of false-positive predictions. 
We have achieved error-per-1000-bp rate of 0.02 and have 0.31 errors per correct prediction, which is significantly better than the results of other human promoter predictors.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The developed method is available as a web server at http:\/\/www.cbrc.kaust.edu.sa\/PromID\/.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty1068","type":"journal-article","created":{"date-parts":[[2018,12,27]],"date-time":"2018-12-27T20:11:26Z","timestamp":1545941486000},"page":"2730-2737","source":"Crossref","is-referenced-by-count":102,"title":["Promoter analysis and prediction in the human genome using sequence-based deep learning models"],"prefix":"10.1093","volume":"35","author":[{"given":"Ramzan","family":"Umarov","sequence":"first","affiliation":[{"name":"Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hiroyuki","family":"Kuwahara","sequence":"additional","affiliation":[{"name":"Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3664-6722","authenticated-orcid":false,"given":"Yu","family":"Li","sequence":"additional","affiliation":[{"name":"Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi 
Arabia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xin","family":"Gao","sequence":"additional","affiliation":[{"name":"Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Victor","family":"Solovyev","sequence":"additional","affiliation":[{"name":"Department of Cell Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2019,1,2]]},"reference":[{"key":"2023062708583631900_bty1068-B1","first-page":"265","author":"Abadi","year":"2016"},{"key":"2023062708583631900_bty1068-B2","doi-asserted-by":"crossref","first-page":"baw093","DOI":"10.1093\/database\/baw093","article-title":"The Ensembl gene annotation system","volume":"2016","author":"Aken","year":"2016","journal-title":"Database"},{"key":"2023062708583631900_bty1068-B3","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1038\/nbt.3739","article-title":"Genome-wide assessment of sequence-intrinsic enhancer responsiveness at single-base-pair resolution","volume":"35","author":"Arnold","year":"2017","journal-title":"Nat. 
Biotechnol"},{"key":"2023062708583631900_bty1068-B4","doi-asserted-by":"crossref","first-page":"1923","DOI":"10.1101\/gr.869803","article-title":"Dragon gene start finder: an advanced system for finding approximate locations of the start of gene transcriptional units","volume":"13","author":"Bajic","year":"2003","journal-title":"Genome Res"},{"key":"2023062708583631900_bty1068-B5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2006-7-s1-s3","article-title":"Performance assessment of promoter predictions on ENCODE regions in the EGASP experiment","volume":"7","author":"Bajic","year":"2006","journal-title":"Genome Biol"},{"key":"2023062708583631900_bty1068-B6","doi-asserted-by":"crossref","first-page":"2583","DOI":"10.1101\/gad.1026202","article-title":"The RNA polymerase II core promoter: a key component in the regulation of gene expression","volume":"16","author":"Butler","year":"2002","journal-title":"Genes Dev"},{"key":"2023062708583631900_bty1068-B7","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1016\/j.bbagrm.2015.04.003","article-title":"The core promoter: at the heart of gene expression","volume":"1849","author":"Danino","year":"2015","journal-title":"Biochim. Biophys. 
Acta"},{"key":"2023062708583631900_bty1068-B8","doi-asserted-by":"crossref","first-page":"D157","DOI":"10.1093\/nar\/gks1233","article-title":"EPD and EPDnew, high-quality promoter resources in the next-generation sequencing era","volume":"41","author":"Dreos","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023062708583631900_bty1068-B9","doi-asserted-by":"crossref","first-page":"D51","DOI":"10.1093\/nar\/gkw1069","article-title":"The eukaryotic promoter database in its 30th year: focus on non-vertebrate organisms","volume":"45","author":"Dreos","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023062708583631900_bty1068-B10","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1016\/j.ajhg.2013.10.012","article-title":"Beyond GWASs: illuminating the dark road from association to function","volume":"93","author":"Edwards","year":"2013","journal-title":"Am. J. Hum. Genet"},{"key":"2023062708583631900_bty1068-B11","doi-asserted-by":"crossref","first-page":"2399","DOI":"10.1101\/gr.138776.112","article-title":"CpG islands and GC content dictate nucleosome depletion in a transcription-independent manner at mammalian promoters","volume":"22","author":"Fenouil","year":"2012","journal-title":"Genome Res"},{"key":"2023062708583631900_bty1068-B12","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1101\/gr.7.9.861","article-title":"Eukaryotic promoter recognition","volume":"7","author":"Fickett","year":"1997","journal-title":"Genome Res"},{"key":"2023062708583631900_bty1068-B13","doi-asserted-by":"crossref","first-page":"e1006773.","DOI":"10.1371\/journal.pgen.1006773","article-title":"Recurrent promoter mutations in melanoma are defined by an extended context-specific mutational signature","volume":"13","author":"Fredriksson","year":"2017","journal-title":"PLoS Genet"},{"key":"2023062708583631900_bty1068-B14","doi-asserted-by":"crossref","first-page":"1358.","DOI":"10.1038\/s41467-017-01467-7","article-title":"The effect of genetic 
variation on promoter usage and enhancer activity","volume":"8","author":"Garieri","year":"2017","journal-title":"Nat. Commun"},{"key":"2023062708583631900_bty1068-B16","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1016\/j.ceb.2008.03.003","article-title":"The RNA polymerase II core promoter\u2014the gateway to transcription","volume":"20","author":"Juven-Gershon","year":"2008","journal-title":"Curr. Opin. Cell Biol"},{"key":"2023062708583631900_bty1068-B17","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1002\/wdev.21","article-title":"Perspectives on the RNA polymerase II core promoter","volume":"1","author":"Kadonaga","year":"2012","journal-title":"Wiley Interdiscip. Rev. Dev. Biol"},{"key":"2023062708583631900_bty1068-B18","first-page":"6980","article-title":"Adam: a method for stochastic optimization","volume":"1412","author":"Kingma","year":"2014","journal-title":"arXiv"},{"key":"2023062708583631900_bty1068-B19","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1093\/bioinformatics\/15.5.356","article-title":"Promoter2.0: for the recognition of polII promoter sequences","volume":"15","author":"Knudsen","year":"1999","journal-title":"Bioinformatics"},{"key":"2023062708583631900_bty1068-B20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1101\/gad.295980.117","article-title":"Finding the start site: redefining the human initiator element","volume":"31","author":"Kugel","year":"2017","journal-title":"Genes Dev"},{"key":"2023062708583631900_bty1068-B21","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1038\/nrg3163","article-title":"Metazoan promoters: emerging characteristics and insights into transcriptional regulation","volume":"13","author":"Lenhard","year":"2012","journal-title":"Nat. Rev. 
Genet"},{"key":"2023062708583631900_bty1068-B22","volume-title":"Molecular Cell Biology","author":"Lodish","year":"2000","edition":"4"},{"key":"2023062708583631900_bty1068-B23","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1146\/annurev.genom.7.080505.115623","article-title":"Transcriptional regulatory elements in the human genome","volume":"7","author":"Maston","year":"2006","journal-title":"Annu. Rev. Genomics Hum. Genet"},{"key":"2023062708583631900_bty1068-B24","doi-asserted-by":"crossref","first-page":"1739","DOI":"10.1534\/genetics.104.026955","article-title":"Enhancer choice in cis and in trans in Drosophila melanogaster: role of the promoter","volume":"167","author":"Morris","year":"2004","journal-title":"Genetics"},{"key":"2023062708583631900_bty1068-B25","first-page":"471","author":"Qian","year":"2018"},{"key":"2023062708583631900_bty1068-B26","article-title":"Regulatory variants: from detection to predicting impact","author":"Rojano","year":"2018","journal-title":"Brief. Bioinform"},{"key":"2023062708583631900_bty1068-B27","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1016\/j.tibs.2015.01.007","article-title":"Core promoters in transcription: old problem, new insights","volume":"40","author":"Roy","year":"2015","journal-title":"Trends Biochem. Sci"},{"key":"2023062708583631900_bty1068-B28","first-page":"294","author":"Salamov","year":"1997"},{"key":"2023062708583631900_bty1068-B29","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1146\/annurev.biochem.72.121801.161520","article-title":"The RNA polymerase II core promoter","volume":"72","author":"Smale","year":"2003","journal-title":"Annu. Rev. 
Biochem"},{"key":"2023062708583631900_bty1068-B30","doi-asserted-by":"crossref","first-page":"3540","DOI":"10.1093\/nar\/gkg525","article-title":"PromH: promoters identification using orthologous genomic sequences","volume":"31","author":"Solovyev","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023062708583631900_bty1068-B31","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res"},{"key":"2023062708583631900_bty1068-B32","doi-asserted-by":"crossref","first-page":"e30.","DOI":"10.1371\/journal.pgen.0020030","article-title":"Heterotachy in mammalian promoter evolution","volume":"2","author":"Taylor","year":"2006","journal-title":"PLoS Genet"},{"key":"2023062708583631900_bty1068-B33","doi-asserted-by":"crossref","first-page":"e0171410.","DOI":"10.1371\/journal.pone.0171410","article-title":"Recognition of prokaryotic and eukaryotic promoters using convolutional deep learning neural networks","volume":"12","author":"Umarov","year":"2017","journal-title":"PLoS One"},{"key":"2023062708583631900_bty1068-B34","doi-asserted-by":"crossref","first-page":"2185.","DOI":"10.1038\/ncomms3185","article-title":"Frequency of TERT promoter mutations in human cancers","volume":"4","author":"Vinagre","year":"2013","journal-title":"Nat. 
Commun"},{"key":"2023062708583631900_bty1068-B35","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1101\/gad.293837.116","article-title":"The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters","volume":"31","author":"Vo Ngoc","year":"2017","journal-title":"Genes Dev"},{"key":"2023062708583631900_bty1068-B36","doi-asserted-by":"crossref","first-page":"1289","DOI":"10.1101\/gad.303149.117","article-title":"The punctilious RNA polymerase II core promoter","volume":"31","author":"Vo Ngoc","year":"2017","journal-title":"Genes Dev"},{"key":"2023062708583631900_bty1068-B37","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1002\/2211-5463.12166","article-title":"DNA structural features of eukaryotic TATA-containing and TATA-less promoters","volume":"7","author":"Yella","year":"2017","journal-title":"FEBS Open Bio"},{"key":"2023062708583631900_bty1068-B38","doi-asserted-by":"crossref","first-page":"556","DOI":"10.1038\/nature13994","article-title":"Enhancer-core-promoter specificity separates developmental and housekeeping gene 
regulation","volume":"518","author":"Zabidi","year":"2015","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/16\/2730\/50719142\/bty1068.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/16\/2730\/50719142\/bty1068.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T09:05:32Z","timestamp":1687856732000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/16\/2730\/5270663"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2019,1,2]]},"references-count":37,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2019,8,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty1068","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,8,15]]},"published":{"date-parts":[[2019,1,2]]}}}