{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T02:57:43Z","timestamp":1768618663504,"version":"3.49.0"},"reference-count":18,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Regulation of gene expression at the level of transcription is a major control point in many biological processes. Transcription factors (TFs) can activate and\/or repress the transcriptional rate of target genes and vascular plant genomes devote approximately 7% of their coding capacity to TFs. Global analysis of TFs has only been performed for three complete higher plant genomes \u2013 Arabidopsis (<jats:italic>Arabidopsis thaliana<\/jats:italic>), poplar (<jats:italic>Populus trichocarpa<\/jats:italic>) and rice (<jats:italic>Oryza sativa<\/jats:italic>). Presently, no large-scale analysis of TFs has been made from a member of the <jats:italic>Solanaceae<\/jats:italic>, one of the most important families of vascular plants. To fill this void, we have analysed tobacco (<jats:italic>Nicotiana tabacum<\/jats:italic>) TFs using a dataset of 1,159,022 gene-space sequence reads (GSRs) obtained by methylation filtering of the tobacco genome. An analytical pipeline was developed to isolate TF sequences from the GSR data set. This involved multiple (typically 10\u201315) independent searches with different versions of the TF family-defining domain(s) (normally the DNA-binding domain) followed by assembly into contigs and verification. Our analysis revealed that tobacco contains a minimum of 2,513 TFs representing all of the 64 well-characterised plant TF families. 
The number of TFs in tobacco is higher than previously reported for Arabidopsis and rice.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>TOBFAC: the database of tobacco transcription factors, is an integrative database that provides a portal to sequence and phylogeny data for the identified TFs, together with a large quantity of other data concerning TFs in tobacco. The database contains an individual page dedicated to each of the 64 TF families. These contain background information, domain architecture via Pfam links, a list of all sequences and an assessment of the minimum number of TFs in this family in tobacco. Downloadable phylogenetic trees of the major families are provided along with detailed information on the bioinformatic pipeline that was used to find all family members. TOBFAC also contains EST data, a list of published tobacco TFs and a list of papers concerning tobacco TFs. The sequences and annotation data are stored in relational tables using a PostgreSQL relational database management system. The data processing and analysis pipelines used the Perl programming language. The web interface was implemented in JavaScript and Perl CGI running on an Apache web server. The computationally intensive data processing and analysis pipelines were run on an Apple XServe cluster with more than 20 nodes.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>TOBFAC is an expandable knowledgebase of tobacco TFs with data currently available for over 2,513 TFs from 64 gene families. TOBFAC integrates available sequence information, phylogenetic analysis, and EST data with published reports on tobacco TF function. 
The database provides a major resource for the study of gene expression in tobacco and the <jats:italic>Solanaceae<\/jats:italic> and helps to fill a current gap in studies of TF families across the plant kingdom. TOBFAC is publicly accessible at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/compsysbio.achs.virginia.edu\/tobfac\/\" ext-link-type=\"uri\">http:\/\/compsysbio.achs.virginia.edu\/tobfac\/<\/jats:ext-link>.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-53","type":"journal-article","created":{"date-parts":[[2008,1,25]],"date-time":"2008-01-25T19:20:45Z","timestamp":1201288845000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":70,"title":["TOBFAC: the database of tobacco transcription factors"],"prefix":"10.1186","volume":"9","author":[{"given":"Paul J","family":"Rushton","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marta T","family":"Bokowiec","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thomas W","family":"Laudeman","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jennifer F","family":"Brannock","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xianfeng","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael P","family":"Timko","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2008,1,25]]},"reference":[{"issue":"4","key":"2038_CR1","doi-asserted-by":"publisher","first-page":"1375","DOI":"10.1104\/pp.010708","volume":"127","author":"DNV Geelen","year":"2001","unstructured":"Geelen DNV, Inze DG: A bright future for the bright yellow-2 cell culture. 
Plant Physiology 2001, 127(4):1375\u20131379. 10.1104\/pp.127.4.1375","journal-title":"Plant Physiology"},{"issue":"5653","key":"2038_CR2","doi-asserted-by":"publisher","first-page":"2115","DOI":"10.1126\/science.1091265","volume":"302","author":"LE Palmer","year":"2003","unstructured":"Palmer LE, Rabinowicz PD, O'Shaughnessy AL, Balija VS, Nascimento LU, Dike S, de la Bastide M, Martienssen RA, McCombie WR: Maize genome sequencing by methylation filtrations. Science 2003, 302(5653):2115\u20132117. 10.1126\/science.1091265","journal-title":"Science"},{"issue":"3","key":"2038_CR3","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1038\/15479","volume":"23","author":"PD Rabinowicz","year":"1999","unstructured":"Rabinowicz PD, Schutz K, Dedhia N, Yordan C, Parnell LD, Stein L, McCombie WR, Martienssen RA: Differential methylation of genes and retrotransposons facilitates shotgun sequencing of the maize genome. Nature Genetics 1999, 23(3):305\u2013308. 10.1038\/15479","journal-title":"Nature Genetics"},{"issue":"5653","key":"2038_CR4","doi-asserted-by":"publisher","first-page":"2118","DOI":"10.1126\/science.1090047","volume":"302","author":"CA Whitelaw","year":"2003","unstructured":"Whitelaw CA, Barbazuk WB, Pertea G, Chan AP, Cheung F, Lee Y, Zheng L, van Heeringen S, Karamycheva S, Bennetzen JL, SanMiguel P, Lakey N, Bedell J, Yuan Y, Budiman MA, Resnick A, Van Aken S, Utterback T, Riedmuller S, Williams M, Feldblyum T, Schubert K, Beachy R, Fraser CM, Quackenbush J: Enrichment of gene-coding sequences in maize by genome filtration. Science 2003, 302(5653):2118\u20132120. 10.1126\/science.1090047","journal-title":"Science"},{"issue":"1","key":"2038_CR5","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1186\/1471-2105-8-129","volume":"8","author":"X Chen","year":"2007","unstructured":"Chen X, Laudeman T, Rushton P, Spraggins T, Timko M: CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) 
methylation filtered genomic genespace sequences. BMC Bioinformatics 2007, 8(1):129. 10.1186\/1471-2105-8-129","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"2038_CR6","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1371\/journal.pbio.0030013","volume":"3","author":"JA Bedell","year":"2005","unstructured":"Bedell JA, Budiman MA, Nunberg A, Citek RW, Robbins D, Jones J, Flick E, Rohlfing T, Fries J, Bradford K, McMenamy J, Smith M, Holeman H, Roe BA, Wiley G, Korf IF, Rabinowicz PD, Lakey N, McCombie WR, Jeddeloh JA, Martienssen RA: Sorghum genome sequencing by methylation filtration. Plos Biology 2005, 3(1):103\u2013115. 10.1371\/journal.pbio.0030013","journal-title":"Plos Biology"},{"issue":"4","key":"2038_CR7","doi-asserted-by":"publisher","first-page":"565","DOI":"10.1139\/g94-081","volume":"37","author":"JL Bennetzen","year":"1994","unstructured":"Bennetzen JL, Schrick K, Springer PS, Brown WE, Sanmiguel P: Active Maize Genes Are Unmodified and Flanked by Diverse Classes of Modified, Highly Repetitive DNA. Genome 1994, 37(4):565\u2013576.","journal-title":"Genome"},{"key":"2038_CR8","unstructured":"Tobacco Genome Initiative[http:\/\/www.tobaccogenome.org\/]"},{"issue":"5499","key":"2038_CR9","doi-asserted-by":"publisher","first-page":"2105","DOI":"10.1126\/science.290.5499.2105","volume":"290","author":"JL Riechmann","year":"2000","unstructured":"Riechmann JL, Heard J, Martin G, Reuber L, Jiang CZ, Keddie J, Adam L, Pineda O, Ratcliffe OJ, Samaha RR, Creelman R, Pilgrim M, Broun P, Zhang JZ, Ghandehari D, Sherman BK, Yu CL: Arabidopsis transcription factors: Genome-wide comparative analysis among eukaryotes. Science 2000, 290(5499):2105\u20132110. 
10.1126\/science.290.5499.2105","journal-title":"Science"},{"issue":"10","key":"2038_CR10","doi-asserted-by":"publisher","first-page":"1286","DOI":"10.1093\/bioinformatics\/btl107","volume":"22","author":"G Gao","year":"2006","unstructured":"Gao G, Zhong Y, Guo A, Zhu Q, Tang W, Zheng W, Gu X, Wei L, Luo J: DRTF: a database of rice transcription factors. Bioinformatics 2006, 22(10):1286\u20131287. 10.1093\/bioinformatics\/btl107","journal-title":"Bioinformatics"},{"issue":"10","key":"2038_CR11","doi-asserted-by":"publisher","first-page":"2568","DOI":"10.1093\/bioinformatics\/bti334","volume":"21","author":"AY Guo","year":"2005","unstructured":"Guo AY, He K, Liu D, Bai SN, Gu XC, Wei LP, Luo JC: DATF: a database of Arabidopsis transcription factors. Bioinformatics 2005, 21(10):2568\u20132569. 10.1093\/bioinformatics\/bti334","journal-title":"Bioinformatics"},{"key":"2038_CR12","volume-title":"Bmc Bioinformatics","author":"DM Riano-Pachon","year":"2007","unstructured":"Riano-Pachon DM, Ruzicic S, Dreyer I, Mueller-Roeber B: PlnTFDB: an integrative plant transcription factor database. Bmc Bioinformatics 2007., 8:"},{"issue":"4","key":"2038_CR13","doi-asserted-by":"publisher","first-page":"1452","DOI":"10.1104\/pp.107.095760","volume":"143","author":"S Richardt","year":"2007","unstructured":"Richardt S, Lang D, Reski R, Frank W, Rensing SA: PlanTAPDB, a Phylogeny-Based Resource of Plant Transcription-Associated Proteins. Plant Physiol 2007, 143(4):1452\u20131466. 
10.1104\/pp.107.095760","journal-title":"Plant Physiol"},{"key":"2038_CR14","unstructured":"Plant Transcription Factor Databases[http:\/\/planttfdb.cbi.pku.edu.cn\/]"},{"key":"2038_CR15","unstructured":"PostgreSQL[http:\/\/www.postgresql.org\/]"},{"key":"2038_CR16","unstructured":"National Center for Biotechnology Information[http:\/\/www.ncbi.nlm.nih.gov\/]"},{"issue":"8","key":"2038_CR17","doi-asserted-by":"publisher","first-page":"1596","DOI":"10.1093\/molbev\/msm092","volume":"24","author":"K Tamura","year":"2007","unstructured":"Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) Software Version 4.0. Mol Biol Evol 2007, 24(8):1596\u20131599. 10.1093\/molbev\/msm092","journal-title":"Mol Biol Evol"},{"key":"2038_CR18","unstructured":"European Sequencing of Tobacco Project[http:\/\/www.estobacco.info\/]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-53.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T10:59:38Z","timestamp":1630493978000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-53"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,1,25]]},"references-count":18,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2038"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-53","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,1,25]]},"assertion":[{"value":"27 September 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 January 
2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 January 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"53"}}