{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T08:02:51Z","timestamp":1778227371789,"version":"3.51.4"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,5,10]],"date-time":"2022-05-10T00:00:00Z","timestamp":1652140800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,5,10]],"date-time":"2022-05-10T00:00:00Z","timestamp":1652140800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005739","name":"Universidad Nacional Aut\u00f3noma de M\u00e9xico","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005739","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Department of Biotechnology, Govt. of India"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Archaea are a vast and unexplored domain. Bioinformatic techniques might enlighten the path to a higher quality genome annotation in varied organisms. Promoter sequences of archaea have the action of a plethora of proteins upon it. 
The conservation found in a structural level of the binding site of proteins such as TBP, TFB, and TFE aids RNAP-DNA stabilization and makes the archaeal promoter prone to be explored by statistical and machine learning techniques.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results and discussions<\/jats:title>\n                <jats:p>In this study, experimentally verified promoter sequences of the organisms <jats:italic>Haloferax volcanii<\/jats:italic>, <jats:italic>Sulfolobus solfataricus<\/jats:italic>, and <jats:italic>Thermococcus kodakarensis<\/jats:italic> were converted into DNA duplex stability attributes (<jats:italic>i.e.<\/jats:italic> numerical variables) and were classified through Artificial Neural Networks and an in-house statistical method of classification, being tested with three forms of controls. The recognition of these promoters enabled its use to validate unannotated promoter sequences in other organisms. As a result, the binding site of basal transcription factors was located through a DNA duplex stability codification. 
Additionally, the classification presented satisfactory results (above 90%) among varied levels of control.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Concluding remarks<\/jats:title>\n                <jats:p>The classification models were employed to perform genomic annotation into the archaea <jats:italic>Aciduliprofundum boonei<\/jats:italic> and <jats:italic>Thermofilum pendens<\/jats:italic>, from which potential promoters have been identified and uploaded into public repositories.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-022-04714-x","type":"journal-article","created":{"date-parts":[[2022,5,10]],"date-time":"2022-05-10T07:02:59Z","timestamp":1652166179000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Machine learning and statistics shape a novel path in archaeal promoter annotation"],"prefix":"10.1186","volume":"23","author":[{"given":"Gustavo Sganzerla","family":"Martinez","sequence":"first","affiliation":[]},{"given":"Ernesto","family":"P\u00e9rez-Rueda","sequence":"additional","affiliation":[]},{"given":"Sharmilee","family":"Sarkar","sequence":"additional","affiliation":[]},{"given":"Aditya","family":"Kumar","sequence":"additional","affiliation":[]},{"given":"Scheila","family":"de \u00c1vila e Silva","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,5,10]]},"reference":[{"key":"4714_CR1","doi-asserted-by":"publisher","DOI":"10.1038\/371695a0","author":"EF DeLong","year":"1994","unstructured":"DeLong EF, Wu KY, Pr\u00e9zelin BB, Jovine RVM. High abundance of Archaea in Antarctic marine picoplankton. Nature. 1994. https:\/\/doi.org\/10.1038\/371695a0.","journal-title":"Nature"},{"key":"4714_CR2","doi-asserted-by":"publisher","DOI":"10.1038\/s41564-020-0715-z","author":"BJ Baker","year":"2020","unstructured":"Baker BJ, De Anda V, Seitz KW, Dombrowski N, Santoro AE, Lloyd KG. 
Diversity, ecology and evolution of Archaea. Nat Microbiol. 2020. https:\/\/doi.org\/10.1038\/s41564-020-0715-z.","journal-title":"Nat Microbiol"},{"key":"4714_CR3","doi-asserted-by":"publisher","DOI":"10.1155\/2006\/629868","author":"RMR Coulson","year":"2007","unstructured":"Coulson RMR, Touboul N, Ouzounis CA. Lineage-specific partitions in archaeal transcription. Archaea. 2007. https:\/\/doi.org\/10.1155\/2006\/629868.","journal-title":"Archaea"},{"key":"4714_CR4","doi-asserted-by":"publisher","DOI":"10.1111\/j.1574-6976.2011.00265.x","author":"JA Leigh","year":"2011","unstructured":"Leigh JA, Albers SV, Atomi H, Allers T. Model organisms for genetics in the domain Archaea: Methanogens, halophiles, Thermococcales and Sulfolobales. FEMS Microbiol Rev. 2011. https:\/\/doi.org\/10.1111\/j.1574-6976.2011.00265.x.","journal-title":"FEMS Microbiol Rev"},{"issue":"6","key":"4714_CR5","doi-asserted-by":"publisher","first-page":"1395","DOI":"10.1111\/j.1365-2958.2007.05876.x","volume":"65","author":"F Werner","year":"2007","unstructured":"Werner F. Structure and function of archaeal RNA polymerases. Mol Microbiol. 2007;65(6):1395\u2013404.","journal-title":"Mol Microbiol"},{"key":"4714_CR6","doi-asserted-by":"publisher","DOI":"10.1038\/nrmicro.2017.133","author":"L Eme","year":"2017","unstructured":"Eme L, Spang A, Lombard J, Stairs CW, Ettema TJG. Archaea and the origin of eukaryotes. Nat Rev Microbiol. 2017. https:\/\/doi.org\/10.1038\/nrmicro.2017.133.","journal-title":"Nat Rev Microbiol"},{"key":"4714_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-3-319-65795-0_1","volume-title":"RNA metabolism and Gene Expression in Archaea","author":"K Smollett","year":"2017","unstructured":"Smollett K, Blombach F, Fouqueau T, Werner F. A global characterisation of the Archaeal transcription machinery. In: Clouet-d\u2019Orval B, editor. RNA metabolism and Gene Expression in Archaea. Springer; 2017. p. 1\u201326. 
https:\/\/doi.org\/10.1007\/978-3-319-65795-0_1."},{"key":"4714_CR8","doi-asserted-by":"publisher","DOI":"10.1042\/ETLS20180014","author":"T Fouqueau","year":"2018","unstructured":"Fouqueau T, Blombach F, Cackett G, Carty AE, Matelska DM, Ofer S, Pilotto S, Phung DK, Werner F. The cutting edge of archaeal transcription. Emerg Top Life Sci. 2018. https:\/\/doi.org\/10.1042\/ETLS20180014.","journal-title":"Emerg Top Life Sci"},{"key":"4714_CR9","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-genet-120116-023413","author":"M Martinez-Pastor","year":"2017","unstructured":"Martinez-Pastor M, Tonner PD, Darnell CL, Schmid AK. Transcriptional regulation in Archaea: from individual genes to global regulatory networks. Annu Rev Genet. 2017. https:\/\/doi.org\/10.1146\/annurev-genet-120116-023413.","journal-title":"Annu Rev Genet"},{"key":"4714_CR10","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1046\/j.1365-2958.1999.01273.x","volume":"31","author":"J Soppa","year":"1999","unstructured":"Soppa J. Transcription initiation in Archaea: facts, factors and future aspects. Mol Microbiol. 1999;31:5. https:\/\/doi.org\/10.1046\/j.1365-2958.1999.01273.x.","journal-title":"Mol Microbiol"},{"key":"4714_CR11","doi-asserted-by":"publisher","DOI":"10.1038\/s41580-018-0028-8","author":"V Haberle","year":"2018","unstructured":"Haberle V, Stark A. Eukaryotic core promoters and the functional basis of transcription initiation. Nat Rev Mol Cell Biol. 2018. https:\/\/doi.org\/10.1038\/s41580-018-0028-8.","journal-title":"Nat Rev Mol Cell Biol"},{"key":"4714_CR12","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1002\/wdev.21","volume":"1","author":"JT Kadonaga","year":"2012","unstructured":"Kadonaga JT. Perspectives on the RNA polymerase II core promoter. Wiley Interdiscipl Rev Dev Biol. 
2012;1:40.","journal-title":"Wiley Interdiscipl Rev Dev Biol"},{"key":"4714_CR13","doi-asserted-by":"publisher","DOI":"10.1186\/s12864-016-2920-y","author":"J Babski","year":"2016","unstructured":"Babski J, Haas KA, N\u00e4ther-Schindler D, Pfeiffer F, F\u00f6rstner KU, Hammelmann M, Hilker R, Becker A, Sharma CM, Marchfelder A, Soppa J. Genome-wide identification of transcriptional start sites in the haloarchaeon Haloferax volcanii based on differential RNA-Seq (dRNA-Seq). BMC Genom. 2016. https:\/\/doi.org\/10.1186\/s12864-016-2920-y.","journal-title":"BMC Genom"},{"key":"4714_CR14","doi-asserted-by":"publisher","unstructured":"She Q, Singh RK, Confalonieri F, Zivanovic Y, Allard G, Awayez MJ, Christina CY, Clausen IG, Curtis BA, De Moors A, Erauso G, Van Der Oost J. The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proceedings of the National Academy of Sciences of the United States of America, 2001. https:\/\/doi.org\/10.1073\/pnas.141222098","DOI":"10.1073\/pnas.141222098"},{"key":"4714_CR15","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2164-15-684","author":"D J\u00e4ger","year":"2014","unstructured":"J\u00e4ger D, F\u00f6rstner KU, Sharma CM, Santangelo TJ, Reeve JN. Primary transcriptome map of the hyperthermophilic archaeon Thermococcus kodakarensis. BMC Genom. 2014. https:\/\/doi.org\/10.1186\/1471-2164-15-684.","journal-title":"BMC Genom"},{"key":"4714_CR16","doi-asserted-by":"publisher","DOI":"10.1038\/79020","author":"MS Bartlett","year":"2000","unstructured":"Bartlett MS, Thomm M, Geiduschek EP. The orientation of DNA in an archaeal transcription initiation complex. Nat Struct Biol. 2000. https:\/\/doi.org\/10.1038\/79020.","journal-title":"Nat Struct Biol"},{"key":"4714_CR17","doi-asserted-by":"publisher","DOI":"10.3389\/fgene.2019.00286","author":"M Oubounyt","year":"2019","unstructured":"Oubounyt M, Louadi Z, Tayara H, To Chong K. Deepromoter: robust promoter predictor using deep learning. Front Genet. 2019. 
https:\/\/doi.org\/10.3389\/fgene.2019.00286.","journal-title":"Front Genet"},{"key":"4714_CR18","doi-asserted-by":"publisher","DOI":"10.1142\/S0219720018400036","author":"A Ryasik","year":"2018","unstructured":"Ryasik A, Orlov M, Zykova E, Ermak T, Sorokin A. Bacterial promoter prediction: selection of dynamic and static physical properties of DNA for reliable sequence classification. J Bioinform Comput Biol. 2018. https:\/\/doi.org\/10.1142\/S0219720018400036.","journal-title":"J Bioinform Comput Biol"},{"key":"4714_CR19","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-018-22129-8","author":"VR Yella","year":"2018","unstructured":"Yella VR, Kumar A, Bansal M. Identification of putative promoters in 48 eukaryotic genomes on the basis of DNA free energy. Sci Rep. 2018. https:\/\/doi.org\/10.1038\/s41598-018-22129-8.","journal-title":"Sci Rep"},{"key":"4714_CR20","doi-asserted-by":"publisher","DOI":"10.1007\/s42452-021-04713-2","author":"GS Martinez","year":"2021","unstructured":"Martinez GS, de \u00c1vila e Silva S, Kumar A, P\u00e9rez-Rueda E. DNA structural and physical properties reveal peculiarities in promoter sequences of the bacterium Escherichia coli K-12. SN Appl Sci. 2021. https:\/\/doi.org\/10.1007\/s42452-021-04713-2.","journal-title":"SN Appl Sci"},{"key":"4714_CR21","doi-asserted-by":"publisher","DOI":"10.1146\/annurev.biophys.32.110601.141800","author":"J SantaLucia","year":"2004","unstructured":"SantaLucia J, Hicks D. The Thermodynamics of DNA structural motifs. Annu Rev Biophys Biomol Struct. 2004. https:\/\/doi.org\/10.1146\/annurev.biophys.32.110601.141800.","journal-title":"Annu Rev Biophys Biomol Struct"},{"key":"4714_CR22","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gki627","author":"A Kanhere","year":"2005","unstructured":"Kanhere A, Bansal M. Structural properties of promoters: Similarities and differences between prokaryotes and eukaryotes. Nucleic Acids Res. 2005. 
https:\/\/doi.org\/10.1093\/nar\/gki627.","journal-title":"Nucleic Acids Res"},{"key":"4714_CR23","doi-asserted-by":"publisher","DOI":"10.1016\/j.jtbi.2011.07.017","author":"S de Avila e Silva","year":"2011","unstructured":"de Avila e Silva S, Echeverrigaray S, Gerhardt GJL. BacPP: bacterial promoter prediction-a tool for accurate sigma-factor specific assignment in enterobacteria. J Theor Biol. 2011. https:\/\/doi.org\/10.1016\/j.jtbi.2011.07.017.","journal-title":"J Theor Biol"},{"key":"4714_CR24","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1974.tb00994.x","author":"M Stone","year":"1974","unstructured":"Stone M. Cross-Validatory choice and assessment of statistical predictions. J Roy Stat Soc Ser B. 1974. https:\/\/doi.org\/10.1111\/j.2517-6161.1974.tb00994.x.","journal-title":"J Roy Stat Soc Ser B"},{"key":"4714_CR25","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v085.i11","author":"MW Beck","year":"2018","unstructured":"Beck MW. NeuralNetTools: visualization and analysis tools for neural networks. J Stat Soft. 2018. https:\/\/doi.org\/10.18637\/jss.v085.i11.","journal-title":"J Stat Soft"},{"key":"4714_CR26","doi-asserted-by":"publisher","DOI":"10.1016\/j.biosystems.2020.104218","author":"X Liu","year":"2020","unstructured":"Liu X, Guo Z, He T, Ren M. Prediction and analysis of prokaryotic promoters based on sequence features. BioSystems. 2020. https:\/\/doi.org\/10.1016\/j.biosystems.2020.104218.","journal-title":"BioSystems"},{"key":"4714_CR27","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1992.4.1.1","author":"S Geman","year":"1992","unstructured":"Geman S, Bienenstock E, Doursat R. Neural networks and the bias\/variance dilemma. Neural Comput. 1992. https:\/\/doi.org\/10.1162\/neco.1992.4.1.1.","journal-title":"Neural Comput"},{"key":"4714_CR28","first-page":"485","volume":"19","author":"S Afaq","year":"2020","unstructured":"Afaq S, Rao S. Significance of epochs on training a neural network. Int J Sci Technol Res. 
2020;19:485.","journal-title":"Int J Sci Technol Res"},{"issue":"5","key":"4714_CR29","doi-asserted-by":"publisher","first-page":"e1230","DOI":"10.1002\/mbo3.1230","volume":"10","author":"GS Martinez","year":"2021","unstructured":"Martinez GS, Sarkar S, Kumar A, P\u00e9rez-Rueda E, de Avila e Silva S. Characterization of promoters in archaeal genomes based on DNA structural parameters. MicrobiologyOpen. 2021;10(5):e1230. https:\/\/doi.org\/10.1002\/mbo3.1230.","journal-title":"MicrobiologyOpen"},{"key":"4714_CR30","doi-asserted-by":"publisher","DOI":"10.1128\/JB.183.5.1813-1818.2001","author":"BL Hanzelka","year":"2001","unstructured":"Hanzelka BL, Darcy TJ, Reeve JN. TFE, an archaeal transcription factor in methanobacterium thermoautotrophicum related to eucaryal transcription factor TFIIE\u03b1. J Bacteriol. 2001. https:\/\/doi.org\/10.1128\/JB.183.5.1813-1818.2001.","journal-title":"J Bacteriol"},{"key":"4714_CR31","doi-asserted-by":"publisher","DOI":"10.1128\/AEM.01005-10","author":"R Takemasa","year":"2011","unstructured":"Takemasa R, Yokooji Y, Yamatsu A, Atomi H, Imanaka T. Thermococcus kodakarensis as a host for gene expression and protein secretion. Appl Environ Microbiol. 2011. https:\/\/doi.org\/10.1128\/AEM.01005-10.","journal-title":"Appl Environ Microbiol"},{"key":"4714_CR32","doi-asserted-by":"publisher","unstructured":"Kumar P, Ambekar S, Kumar M, Roy S. Data mining - methods applications and systems, 2020. https:\/\/doi.org\/10.5772\/intechopen.87784","DOI":"10.5772\/intechopen.87784"},{"key":"4714_CR33","doi-asserted-by":"publisher","unstructured":"Mangal R, Nori AV, Orso A. Robustness of neural networks: A probabilistic and practical approach. Proceedings - 2019 IEEE\/ACM 41st international conference on software engineering: new ideas and emerging results, ICSE-NIER 2019. 
https:\/\/doi.org\/10.1109\/ICSE-NIER.2019.00032","DOI":"10.1109\/ICSE-NIER.2019.00032"},{"key":"4714_CR34","doi-asserted-by":"publisher","DOI":"10.1016\/j.jtbi.2010.01.013","author":"Y Xu","year":"2010","unstructured":"Xu Y, Wang XB, Ding J, Wu LY, Deng NY. Lysine acetylation sites prediction using an ensemble of support vector machine classifiers. J Theor Biol. 2010. https:\/\/doi.org\/10.1016\/j.jtbi.2010.01.013.","journal-title":"J Theor Biol"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04714-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-022-04714-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04714-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,10]],"date-time":"2022-05-10T07:03:48Z","timestamp":1652166228000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-022-04714-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,10]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["4714"],"URL":"https:\/\/doi.org\/10.1186\/s12859-022-04714-x","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,10]]},"assertion":[{"value":"16 December 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 May 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"10 May 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"171"}}