{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T06:22:00Z","timestamp":1772173320247,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1011824","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T00:00:00Z","timestamp":1706745600000}}],"reference-count":31,"publisher":"Public Library of Science (PLoS)","issue":"1","license":[{"start":{"date-parts":[[2024,1,22]],"date-time":"2024-01-22T00:00:00Z","timestamp":1705881600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100009708","name":"Novo Nordisk Fonden","doi-asserted-by":"publisher","award":["NNF10CC1016517"],"award-info":[{"award-number":["NNF10CC1016517"]}],"id":[{"id":"10.13039\/501100009708","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100009708","name":"Novo Nordisk Fonden","doi-asserted-by":"publisher","award":["NNF20CC0035580"],"award-info":[{"award-number":["NNF20CC0035580"]}],"id":[{"id":"10.13039\/501100009708","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>\n                    The transcriptional regulatory network (TRN) of\n                    <jats:italic>E<\/jats:italic>\n                    .\n                    <jats:italic>coli<\/jats:italic>\n                    consists of thousands of interactions between regulators and DNA sequences. Regulons are typically determined either from resource-intensive experimental measurement of functional binding sites, or inferred from analysis of high-throughput gene expression datasets. Recently, independent component analysis (ICA) of RNA-seq compendia has shown to be a powerful method for inferring bacterial regulons. 
However, it remains unclear to what extent regulons predicted by ICA structure have a biochemical basis in promoter sequences. Here, we address this question by developing machine learning models that predict inferred regulon structures in\n                    <jats:italic>E<\/jats:italic>\n                    .\n                    <jats:italic>coli<\/jats:italic>\n                    based on promoter sequence features. Models were constructed successfully (cross-validation AUROC &gt;= 0.8) for 85% (40\/47) of ICA-inferred\n                    <jats:italic>E<\/jats:italic>\n                    .\n                    <jats:italic>coli<\/jats:italic>\n                    regulons. We found that: 1) the presence of a high-scoring regulator motif in the promoter region was sufficient to specify regulatory activity in 40% (19\/47) of the regulons; 2) additional features, such as DNA shape and extended motifs that can account for regulator multimeric binding, helped to specify regulon structure for the remaining 60% of regulons (28\/47); 3) investigating regulons where initial machine learning models failed revealed new regulator-specific sequence features that improved model accuracy. Finally, we found that strong regulatory binding sequences underlie both the genes shared between ICA-inferred and experimental regulons and the genes in the\n                    <jats:italic>E<\/jats:italic>\n                    .\n                    <jats:italic>coli<\/jats:italic>\n                    core pan-regulon of Fur. 
This work demonstrates that the structure of ICA-inferred regulons largely can be understood through the strength of regulator binding sites in promoter regions, reinforcing the utility of top-down inference for regulon discovery.\n                  <\/jats:p>","DOI":"10.1371\/journal.pcbi.1011824","type":"journal-article","created":{"date-parts":[[2024,1,22]],"date-time":"2024-01-22T13:28:37Z","timestamp":1705930117000},"page":"e1011824","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":2,"title":["Inferred regulons are consistent with regulator binding sequences in E. coli"],"prefix":"10.1371","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1936-1223","authenticated-orcid":true,"given":"Sizhe","family":"Qiu","sequence":"first","affiliation":[]},{"given":"Xinlong","family":"Wan","sequence":"additional","affiliation":[]},{"given":"Yueshan","family":"Liang","sequence":"additional","affiliation":[]},{"given":"Cameron R.","family":"Lamoureux","sequence":"additional","affiliation":[]},{"given":"Amir","family":"Akbari","sequence":"additional","affiliation":[]},{"given":"Bernhard O.","family":"Palsson","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6452-483X","authenticated-orcid":true,"given":"Daniel C.","family":"Zielinski","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2024,1,22]]},"reference":[{"issue":"1","key":"pcbi.1011824.ref001","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrmicro787","article-title":"The regulation of bacterial transcription initiation","volume":"2","author":"DF Browning","year":"2004","journal-title":"Nat Rev Microbiol"},{"issue":"1","key":"pcbi.1011824.ref002","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1016\/j.jmb.2008.05.054","article-title":"Functional organisation of Escherichia coli transcriptional regulatory 
network","volume":"381","author":"A Mart\u00ednez-Antonio","year":"2008","journal-title":"J Mol Biol"},{"issue":"2","key":"pcbi.1011824.ref003","doi-asserted-by":"crossref","first-page":"e6","DOI":"10.1093\/nar\/gkq1071","article-title":"Use of structural DNA properties for the prediction of transcription-factor binding sites in Escherichia coli","volume":"39","author":"P Meysman","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1011824.ref004","doi-asserted-by":"crossref","first-page":"e55308","DOI":"10.7554\/eLife.55308","article-title":"Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time.","volume":"9","author":"WT Ireland","year":"2020","journal-title":"Elife"},{"key":"pcbi.1011824.ref005","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1016\/j.ymeth.2015.05.022","article-title":"Defining bacterial regulons using ChIP-seq.","volume":"86","author":"KS Myers","year":"2015","journal-title":"Methods"},{"key":"pcbi.1011824.ref006","doi-asserted-by":"crossref","first-page":"e1004264","DOI":"10.1371\/journal.pgen.1004264","article-title":"Determining the Control Circuitry of Redox Metabolism at the Genome-Scale","author":"S Federowicz","year":"2014","journal-title":"10, PLoS Genetics."},{"issue":"5","key":"pcbi.1011824.ref007","doi-asserted-by":"crossref","first-page":"e0197272","DOI":"10.1371\/journal.pone.0197272","article-title":"ChIP-exo interrogation of Crp, DNA, and RNAP holoenzyme interactions.","volume":"13","author":"H Latif","year":"2018","journal-title":"PLoS One."},{"issue":"49","key":"pcbi.1011824.ref008","doi-asserted-by":"crossref","first-page":"19462","DOI":"10.1073\/pnas.0807227105","article-title":"Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli","volume":"105","author":"BK Cho","year":"2008","journal-title":"Proc Natl Acad Sci U S 
A"},{"issue":"10","key":"pcbi.1011824.ref009","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1038\/nrmicro2419","article-title":"Advantages and limitations of current network inference methods","volume":"8","author":"R De Smet","year":"2010","journal-title":"Nat Rev Microbiol"},{"issue":"1","key":"pcbi.1011824.ref010","doi-asserted-by":"crossref","first-page":"5536","DOI":"10.1038\/s41467-019-13483-w","article-title":"The Escherichia coli transcriptome mostly consists of independently regulated modules.","volume":"10","author":"AV Sastry","year":"2019","journal-title":"Nat Commun."},{"key":"pcbi.1011824.ref011","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1016\/S0893-6080(00)00026-5","article-title":"Independent component analysis: algorithms and applications","author":"A Hyv\u00e4rinen","year":"2000","journal-title":"13, Neural Networks."},{"issue":"1","key":"pcbi.1011824.ref012","doi-asserted-by":"crossref","first-page":"6338","DOI":"10.1038\/s41467-020-20153-9","article-title":"Machine learning uncovers independently regulated modules in the Bacillus subtilis transcriptome.","volume":"11","author":"K Rychel","year":"2020","journal-title":"Nat Commun."},{"key":"pcbi.1011824.ref013","first-page":"2021","article-title":"A multi-scale transcriptional regulatory network knowledge base for Escherichia coli","author":"CR Lamoureux","year":"2022","journal-title":"bioRxiv"},{"key":"pcbi.1011824.ref014","unstructured":"Lundberg, Lee. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst [Internet]. 
Available from: https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/8a20a8621978632d76c43dfd28b67767-Abstract.html"},{"issue":"6","key":"pcbi.1011824.ref015","doi-asserted-by":"crossref","first-page":"3079","DOI":"10.1093\/nar\/gkv150","article-title":"The architecture of ArgR-DNA complexes at the genome-scale in Escherichia coli","volume":"43","author":"S Cho","year":"2015","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"pcbi.1011824.ref016","doi-asserted-by":"crossref","first-page":"e1003839","DOI":"10.1371\/journal.pgen.1003839","article-title":"The bacterial response regulator ArcA uses a diverse binding site architecture to regulate carbon oxidation globally.","volume":"9","author":"DM Park","year":"2013","journal-title":"PLoS Genet."},{"issue":"3","key":"pcbi.1011824.ref017","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1016\/j.cels.2016.07.001","article-title":"DNA Shape Features Improve Transcription Factor Binding Site Predictions In Vivo.","volume":"3","author":"A Mathelier","year":"2016","journal-title":"Cell Syst"},{"issue":"10","key":"pcbi.1011824.ref018","doi-asserted-by":"crossref","first-page":"1196","DOI":"10.1038\/s41592-021-01252-x","article-title":"Effective gene expression prediction from sequence by integrating long-range interactions","volume":"18","author":"\u017d Avsec","year":"2021","journal-title":"Nat Methods"},{"issue":"7901","key":"pcbi.1011824.ref019","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1038\/s41586-022-04506-6","article-title":"The evolution, evolvability and engineering of gene regulatory DNA","volume":"603","author":"ED Vaishnav","year":"2022","journal-title":"Nature"},{"key":"pcbi.1011824.ref020","first-page":"2023","article-title":"Reconstructing the Transcriptional Regulatory Network of Probiotic L. 
reuteri is Enabled by Transcriptomics and Machine Learning","author":"J Josephs-Spaulding","year":"2023","journal-title":"bioRxiv"},{"key":"pcbi.1011824.ref021","doi-asserted-by":"crossref","first-page":"D212","DOI":"10.1093\/nar\/gky1077","article-title":"RegulonDB v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation inE.","volume":"47","author":"A Santos-Zavaleta","year":"2019","journal-title":"Nucleic Acids Research"},{"issue":"18","key":"pcbi.1011824.ref022","doi-asserted-by":"crossref","first-page":"10157","DOI":"10.1093\/nar\/gkaa774","article-title":"The Bitome: digitized genomic features reveal fundamental genome organization","volume":"48","author":"CR Lamoureux","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1011824.ref023","first-page":"365","volume-title":"Transcription Regulation in Prokaryotes","author":"R. Wagner","year":"2000"},{"issue":"9","key":"pcbi.1011824.ref024","doi-asserted-by":"crossref","first-page":"4891","DOI":"10.1093\/nar\/gkaa244","article-title":"A non-canonical promoter element drives spurious transcription of horizontally acquired bacterial genes","volume":"48","author":"EA Warman","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"7","key":"pcbi.1011824.ref025","doi-asserted-by":"crossref","first-page":"2194","DOI":"10.1128\/JB.185.7.2194-2202.2003","article-title":"Architecture of a fur binding site: a comparative analysis","volume":"185","author":"JL Lavrrar","year":"2003","journal-title":"J Bacteriol"},{"key":"pcbi.1011824.ref026","doi-asserted-by":"crossref","first-page":"W202","DOI":"10.1093\/nar\/gkp335","article-title":"MEME SUITE: tools for motif discovery and searching","author":"TL Bailey","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1011824.ref027","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1093\/bioinformatics\/btv735","article-title":"DNAshapeR: an R\/Bioconductor package for DNA shape prediction and feature encoding 
[Internet].","volume":"32","author":"TP Chiu","year":"2016","journal-title":"Bioinformatics"},{"key":"pcbi.1011824.ref028","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1080\/07391102.2015.1032554","article-title":"14 Quantitative modeling of transcription factor binding specificities using DNA shape [Internet].","volume":"33","author":"T Zhou","year":"2015","journal-title":"Journal of Biomolecular Structure and Dynamics"},{"key":"pcbi.1011824.ref029","first-page":"100","article-title":"Learning scikit-learn: Machine Learning in Python.","author":"R Garreta","year":"2013","journal-title":"Packt Publishing Ltd;"},{"issue":"17","key":"pcbi.1011824.ref030","first-page":"1","article-title":"Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning.","volume":"18","author":"G Lema\u00eetre","year":"2017","journal-title":"J Mach Learn Res."},{"key":"pcbi.1011824.ref031","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool [Internet].","volume":"215","author":"SF Altschul","year":"1990","journal-title":"Journal of Molecular Biology"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1011824","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T00:00:00Z","timestamp":1706745600000}}],"container-title":["PLOS Computational 
Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011824","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T13:54:22Z","timestamp":1706795662000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011824"}},"subtitle":[],"editor":[{"given":"Sunil","family":"Laxman","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,1,22]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,1,22]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1011824","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.02.20.481200","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,22]]}}}