{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T06:48:51Z","timestamp":1775112531059,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1009376","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,9,17]],"date-time":"2021-09-17T00:00:00Z","timestamp":1631836800000}}],"reference-count":48,"publisher":"Public Library of Science (PLoS)","issue":"9","license":[{"start":{"date-parts":[[2021,9,7]],"date-time":"2021-09-07T00:00:00Z","timestamp":1630972800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the number of negatives is much higher than that in the training sets. To address this issue, we propose a dynamic negative set updating scheme with a two-model approach, using one model for scanning the genome and the other one for testing candidate positions. The developed method achieves good genome-level performance and maintains robust performance when applied to other vertebrate species, without re-training. Moreover, the unannotated predicted regulatory regions made on the human genome are enriched for disease-associated variants, suggesting them to be potentially true regulatory elements rather than false positives. We validated high scoring \u201cfalse positive\u201d predictions using reporter assay and all tested candidates were successfully validated, demonstrating the ability of our method to discover novel human regulatory regions.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1009376","type":"journal-article","created":{"date-parts":[[2021,9,7]],"date-time":"2021-09-07T14:05:00Z","timestamp":1631023500000},"page":"e1009376","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":10,"title":["ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation"],"prefix":"10.1371","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3477-7101","authenticated-orcid":true,"given":"Ramzan","family":"Umarov","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3664-6722","authenticated-orcid":true,"given":"Yu","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Takahiro","family":"Arakawa","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4317-2158","authenticated-orcid":true,"given":"Satoshi","family":"Takizawa","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7108-3574","authenticated-orcid":true,"given":"Xin","family":"Gao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1225-4908","authenticated-orcid":true,"given":"Erik","family":"Arner","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"340","published-online":{"date-parts":[[2021,9,7]]},"reference":[{"issue":"4","key":"pcbi.1009376.ref001","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1038\/nrg3163","article-title":"Metazoan promoters: emerging characteristics and insights into transcriptional regulation","volume":"13","author":"B Lenhard","year":"2012","journal-title":"Nat Rev Genet"},{"issue":"3","key":"pcbi.1009376.ref002","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1016\/j.tibs.2015.01.007","article-title":"Core promoters in transcription: old problem, new insights","volume":"40","author":"AL Roy","year":"2015","journal-title":"Trends Biochem Sci"},{"issue":"8","key":"pcbi.1009376.ref003","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1038\/s41576-019-0128-0","article-title":"Long-range enhancer-promoter contacts in gene expression control","volume":"20","author":"S Schoenfelder","year":"2019","journal-title":"Nat Rev Genet"},{"key":"pcbi.1009376.ref004","doi-asserted-by":"crossref","first-page":"5336","DOI":"10.1038\/ncomms6336","article-title":"Nuclear stability and transcriptional directionality separate functionally distinct RNA species","volume":"5","author":"R Andersson","year":"2014","journal-title":"Nat Commun"},{"issue":"12","key":"pcbi.1009376.ref005","doi-asserted-by":"crossref","first-page":"1311","DOI":"10.1038\/ng.3142","article-title":"Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers","volume":"46","author":"LJ Core","year":"2014","journal-title":"Nat Genet"},{"issue":"7629","key":"pcbi.1009376.ref006","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1038\/nature20149","article-title":"Local regulation of gene expression by lncRNA promoters, transcription and splicing","volume":"539","author":"JM Engreitz","year":"2016","journal-title":"Nature"},{"issue":"7295","key":"pcbi.1009376.ref007","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1038\/nature09033","article-title":"Widespread transcription at neuronal activity-regulated enhancers","volume":"465","author":"T-K Kim","year":"2010","journal-title":"Nature"},{"issue":"18","key":"pcbi.1009376.ref008","doi-asserted-by":"crossref","first-page":"2847","DOI":"10.4161\/15384101.2014.949201","article-title":"Role of ChIP-seq in the discovery of transcription factor binding sites, differential gene regulation mechanism, epigenetic marks and beyond","volume":"13","author":"R Mundade","year":"2014","journal-title":"Cell Cycle Georget Tex"},{"issue":"2","key":"pcbi.1009376.ref009","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1002\/wdev.168","article-title":"Identifying transcriptional cis-regulatory modules in animal genomes","volume":"4","author":"K Suryamohan","year":"2015","journal-title":"Wiley Interdiscip Rev Dev Biol"},{"key":"pcbi.1009376.ref010","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1016\/j.csbj.2016.06.004","article-title":"Dry and wet approaches for genome-wide functional annotation of conventional and unconventional transcriptional activators","volume":"14","author":"E Levati","year":"2016","journal-title":"Comput Struct Biotechnol J"},{"issue":"2","key":"pcbi.1009376.ref011","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1101\/gr.6991408","article-title":"Generic eukaryotic core promoter prediction using structural features of DNA","volume":"18","author":"T Abeel","year":"2008","journal-title":"Genome Res"},{"issue":"7","key":"pcbi.1009376.ref012","doi-asserted-by":"crossref","first-page":"1125","DOI":"10.1093\/bioinformatics\/bty752","article-title":"DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions","volume":"35","author":"M Kalkatawi","year":"2019","journal-title":"Bioinforma Oxf Engl"},{"issue":"13","key":"pcbi.1009376.ref013","doi-asserted-by":"crossref","first-page":"1930","DOI":"10.1093\/bioinformatics\/btx105","article-title":"BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone","volume":"33","author":"B Yang","year":"2017","journal-title":"Bioinforma Oxf Engl"},{"issue":"1","key":"pcbi.1009376.ref014","doi-asserted-by":"crossref","first-page":"e6","DOI":"10.1093\/nar\/gku1058","article-title":"DEEP: a general computational framework for predicting enhancers","volume":"43","author":"D Kleftogiannis","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1009376.ref015","doi-asserted-by":"crossref","first-page":"e278","DOI":"10.7717\/peerj-cs.278","article-title":"Genome annotation across species using deep convolutional neural networks","volume":"6","author":"G Khodabandelou","year":"2020","journal-title":"PeerJ Comput Sci"},{"issue":"1","key":"pcbi.1009376.ref016","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1186\/s13059-019-1860-7","article-title":"CRUP: a comprehensive framework to predict condition-specific regulatory units","volume":"20","author":"A Ramisch","year":"2019","journal-title":"Genome Biol"},{"issue":"7","key":"pcbi.1009376.ref017","doi-asserted-by":"crossref","first-page":"2926","DOI":"10.1073\/pnas.0909344107","article-title":"Histone modification levels are predictive for gene expression","volume":"107","author":"R Karli\u0107","year":"2010","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"9","key":"pcbi.1009376.ref018","doi-asserted-by":"crossref","first-page":"E1633","DOI":"10.1073\/pnas.1618353114","article-title":"Improved regulatory element prediction based on tissue-specific local epigenomic signatures","volume":"114","author":"Y He","year":"2017","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"10","key":"pcbi.1009376.ref019","doi-asserted-by":"crossref","first-page":"e77","DOI":"10.1093\/nar\/gks149","article-title":"Genome-wide enhancer prediction from epigenetic signatures using genetic algorithm-optimized support vector machines","volume":"40","author":"M Fern\u00e1ndez","year":"2012","journal-title":"Nucleic Acids Res"},{"issue":"8","key":"pcbi.1009376.ref020","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1038\/s41592-020-0907-8","article-title":"Supervised enhancer prediction with epigenetic pattern recognition and targeted validation","volume":"17","author":"A Sethi","year":"2020","journal-title":"Nat Methods."},{"issue":"1","key":"pcbi.1009376.ref021","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1186\/s13059-020-02220-y","article-title":"MethylationToActivity: a deep-learning framework that reveals promoter activity landscapes from DNA methylomes in individual tumors","volume":"22","author":"J Williams","year":"2021","journal-title":"Genome Biol"},{"key":"pcbi.1009376.ref022","doi-asserted-by":"crossref","first-page":"38433","DOI":"10.1038\/srep38433","article-title":"EP-DNN: A Deep Neural Network-Based Global Enhancer Prediction Algorithm","volume":"6","author":"SG Kim","year":"2016","journal-title":"Sci Rep"},{"issue":"3","key":"pcbi.1009376.ref023","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1038\/nprot.2012.005","article-title":"5\u2019 end-centered expression profiling using cap-analysis gene expression and next-generation sequencing","volume":"7","author":"H Takahashi","year":"2012","journal-title":"Nat Protoc"},{"issue":"D1","key":"pcbi.1009376.ref024","doi-asserted-by":"crossref","first-page":"D1005","DOI":"10.1093\/nar\/gky1120","article-title":"The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019","volume":"47","author":"A Buniello","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"pcbi.1009376.ref025","doi-asserted-by":"crossref","first-page":"D1062","DOI":"10.1093\/nar\/gkx1153","article-title":"ClinVar: improving access to variant interpretations and supporting evidence","volume":"46","author":"MJ Landrum","year":"2018","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"pcbi.1009376.ref026","doi-asserted-by":"crossref","first-page":"D752","DOI":"10.1093\/nar\/gky1099","article-title":"Update of the FANTOM web resource: expansion to provide additional transcriptome atlases","volume":"47","author":"M Lizio","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"7","key":"pcbi.1009376.ref027","doi-asserted-by":"crossref","first-page":"990","DOI":"10.1101\/gr.200535.115","article-title":"Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks","volume":"26","author":"DR Kelley","year":"2016","journal-title":"Genome Res"},{"issue":"1","key":"pcbi.1009376.ref028","doi-asserted-by":"crossref","first-page":"4520","DOI":"10.1038\/s41598-018-22129-8","article-title":"Identification of putative promoters in 48 eukaryotic genomes on the basis of DNA free energy","volume":"8","author":"VR Yella","year":"2018","journal-title":"Sci Rep"},{"issue":"4","key":"pcbi.1009376.ref029","doi-asserted-by":"crossref","first-page":"R33","DOI":"10.1186\/gb-2005-6-4-r33","article-title":"Promoter features related to tissue specificity as measured by Shannon entropy","volume":"6","author":"J Schug","year":"2005","journal-title":"Genome Biol"},{"key":"pcbi.1009376.ref030","doi-asserted-by":"crossref","first-page":"D88","DOI":"10.1093\/nar\/gkl822","article-title":"VISTA Enhancer Browser\u2014a database of tissue-specific human enhancers","volume":"35","author":"A Visel","year":"2007","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"pcbi.1009376.ref031","doi-asserted-by":"crossref","first-page":"2167","DOI":"10.1101\/gr.121905.111","article-title":"Discriminative prediction of mammalian enhancers from DNA sequence","volume":"21","author":"D Lee","year":"2011","journal-title":"Genome Res"},{"issue":"2","key":"pcbi.1009376.ref032","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1101\/gr.236075.118","article-title":"Systematic interrogation of human promoters","volume":"29","author":"S Weingarten-Gabbay","year":"2019","journal-title":"Genome Res"},{"key":"pcbi.1009376.ref033","doi-asserted-by":"crossref","first-page":"D1001","DOI":"10.1093\/nar\/gkt1229","article-title":"The NHGRI GWAS Catalog, a curated resource of SNP-trait associations","volume":"42","author":"D Welter","year":"2014","journal-title":"Nucleic Acids Res"},{"issue":"7539","key":"pcbi.1009376.ref034","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1038\/nature14248","article-title":"Integrative analysis of 111 reference human epigenomes","volume":"518","author":"Roadmap Epigenomics Consortium","year":"2015","journal-title":"Nature"},{"issue":"3","key":"pcbi.1009376.ref035","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1080\/21541264.2015.1067286","article-title":"ElemeNT: a computational tool for detecting core promoter elements","volume":"6","author":"A Sloutskin","year":"2015","journal-title":"Transcription"},{"key":"pcbi.1009376.ref036","doi-asserted-by":"crossref","first-page":"gkz1001","DOI":"10.1093\/nar\/gkz1001","article-title":"JASPAR 2020: update of the open-access database of transcription factor binding profiles","author":"O Fornes","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"5628","key":"pcbi.1009376.ref037","doi-asserted-by":"crossref","first-page":"2097","DOI":"10.1126\/science.1084648","article-title":"Comprehensive identification of human bZIP interactions with coiled-coil arrays","volume":"300","author":"JRS Newman","year":"2003","journal-title":"Science"},{"issue":"5","key":"pcbi.1009376.ref038","doi-asserted-by":"crossref","first-page":"744","DOI":"10.1016\/j.cell.2010.01.044","article-title":"An Atlas of Combinatorial Transcriptional Regulation in Mouse and Man","volume":"140","author":"T Ravasi","year":"2010","journal-title":"Cell"},{"issue":"7","key":"pcbi.1009376.ref039","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1093\/bioinformatics\/btr064","article-title":"FIMO: scanning for occurrences of a given motif.","volume":"27","author":"CE Grant","year":"2011","journal-title":"Bioinforma Oxf Engl"},{"issue":"1","key":"pcbi.1009376.ref040","doi-asserted-by":"crossref","first-page":"3488","DOI":"10.1038\/s41467-020-17155-y","article-title":"Deep learning for genomics using Janggu","volume":"11","author":"W Kopp","year":"2020","journal-title":"Nat Commun"},{"issue":"4","key":"pcbi.1009376.ref041","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/s40484-013-0022-2","article-title":"NPEST: a nonparametric method and a database for transcription start site prediction.","volume":"1","author":"T Tatarinova","year":"2013","journal-title":"Quant Biol Beijing China"},{"issue":"16","key":"pcbi.1009376.ref042","doi-asserted-by":"crossref","first-page":"2730","DOI":"10.1093\/bioinformatics\/bty1068","article-title":"Promoter analysis and prediction in the human genome using sequence-based deep learning models","volume":"35","author":"R Umarov","year":"2019","journal-title":"Bioinforma Oxf Engl"},{"issue":"2","key":"pcbi.1009376.ref043","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1038\/s41576-019-0173-8","article-title":"Determinants of enhancer and promoter activities of regulatory elements","volume":"21","author":"R Andersson","year":"2020","journal-title":"Nat Rev Genet"},{"key":"pcbi.1009376.ref044","article-title":"Deep Residual Learning for Image Recognition","author":"K He","year":"2015","journal-title":"ArXiv151203385 Cs"},{"key":"pcbi.1009376.ref045","article-title":"Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift","author":"S Ioffe","year":"2015","journal-title":"ArXiv150203167 Cs"},{"key":"pcbi.1009376.ref046","first-page":"3","volume-title":"Proc icml","author":"AL Maas","year":"2013"},{"key":"pcbi.1009376.ref047","unstructured":"Kingma DP, Ba J. Adam: A method for stochastic optimization. ArXiv Prepr ArXiv14126980. 2014;"},{"key":"pcbi.1009376.ref048","unstructured":"Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, et al. Tensorflow: A system for large-scale machine learning. In: 12th {USENIX} symposium on operating systems design and implementation ({OSDI} 16). 2016. p. 265\u201383."}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1009376","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,9,17]],"date-time":"2021-09-17T00:00:00Z","timestamp":1631836800000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009376","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,17]],"date-time":"2021-09-17T13:44:32Z","timestamp":1631886272000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009376"}},"subtitle":[],"editor":[{"given":"Andrey","family":"Rzhetsky","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,9,7]]},"references-count":48,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2021,9,7]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009376","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.03.31.437992","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,7]]}}}