{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T20:19:48Z","timestamp":1778012388730,"version":"3.51.4"},"reference-count":61,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2021,5,29]],"date-time":"2021-05-29T00:00:00Z","timestamp":1622246400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002322","name":"CAPES","doi-asserted-by":"publisher","award":["001"],"award-info":[{"award-number":["001"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"CAPES","doi-asserted-by":"publisher","award":["DS-1454337"],"award-info":[{"award-number":["DS-1454337"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"CAPES","doi-asserted-by":"publisher","award":["DS-1560211"],"award-info":[{"award-number":["DS-1560211"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003593","name":"Conselho Nacional de Desenvolvimento Cient\u00edfico e Tecnol\u00f3gico","doi-asserted-by":"publisher","award":["304360\/2014-7"],"award-info":[{"award-number":["304360\/2014-7"]}],"id":[{"id":"10.13039\/501100003593","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,11,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Promoter annotation is an important task in the analysis of a genome. One of the main challenges for this task is locating the border between the promoter region and the transcribing region of the gene, the transcription start site (TSS). 
The TSS is the reference point for delimiting the DNA sequence responsible for the assembly of the transcribing complex. Because the same gene can have more than one TSS, to delimit the promoter region it is important to locate the TSS closest to the translation start site. This paper presents TSSFinder, new software for predicting the TSS signal of eukaryotic genes that is significantly more accurate than other available software. TSSFinder is currently the only application to offer pre-trained models for six different eukaryotic organisms: Arabidopsis thaliana, Drosophila melanogaster, Gallus gallus, Homo sapiens, Oryza sativa and Saccharomyces cerevisiae. Additionally, the software can be easily customized for specific organisms using only 125 DNA sequences with a validated TSS signal and corresponding genomic locations as a training set. TSSFinder is a valuable new tool for the annotation of genomes. TSSFinder source code and Docker container can be downloaded from http:\/\/tssfinder.github.io. 
Alternatively, TSSFinder is also available as a web service at http:\/\/sucest-fun.org\/wsapp\/tssfinder\/.<\/jats:p>","DOI":"10.1093\/bib\/bbab198","type":"journal-article","created":{"date-parts":[[2021,5,4]],"date-time":"2021-05-04T21:25:59Z","timestamp":1620163559000},"source":"Crossref","is-referenced-by-count":17,"title":["TSSFinder\u2014fast and accurate <i>ab initio<\/i> prediction of the core promoter in eukaryotic genomes"],"prefix":"10.1093","volume":"22","author":[{"given":"Mauro","family":"de Medeiros Oliveira","sequence":"first","affiliation":[{"name":"Instituto Carlos Chagas, Funda\u00e7\u00e3o Oswaldo Cruz, Paran\u00e1, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Igor","family":"Bonadio","sequence":"additional","affiliation":[{"name":"Data Science, Elo7 Research Lab, S\u00e3o Paulo, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alicia","family":"Lie de Melo","sequence":"additional","affiliation":[{"name":"Biochemistry Department, Universidade de S\u00e3o Paulo, S\u00e3o Paulo, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Glaucia","family":"Mendes Souza","sequence":"additional","affiliation":[{"name":"Biochemistry Department, Universidade de S\u00e3o Paulo, S\u00e3o Paulo, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alan Mitchell","family":"Durham","sequence":"additional","affiliation":[{"name":"Computer Science, Universidade de S\u00e3o Paulo, S\u00e3o Paulo, Brazil"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2021,5,31]]},"reference":[{"issue":"4","key":"2021110814400534400_ref1","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1038\/nrg3163","article-title":"Metazoan promoters: emerging characteristics and insights into transcriptional regulation","volume":"13","author":"Lenhard","year":"2012","journal-title":"Nat Rev 
Genet"},{"issue":"1","key":"2021110814400534400_ref2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-018-22129-8","article-title":"Identification of putative promoters in 48 eukaryotic genomes on the basis of DNA free energy","volume":"8","author":"Yella","year":"2018","journal-title":"Sci Rep"},{"issue":"3","key":"2021110814400534400_ref3","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1016\/j.tibs.2015.01.007","article-title":"Core promoters in transcription: old problem, new insights","volume":"40","author":"Roy","year":"2015","journal-title":"Trends Biochem Sci"},{"issue":"8","key":"2021110814400534400_ref4","first-page":"e65","article-title":"TSSPlant: a new tool for prediction of plant Pol II promoters","volume":"45","author":"Shahmuradov","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"18","key":"2021110814400534400_ref5","doi-asserted-by":"crossref","first-page":"2013","DOI":"10.1101\/gad.1951110","article-title":"The TCT motif, a key component of an RNA polymerase II transcription system for the translational machinery","volume":"24","author":"Parry","year":"2010","journal-title":"Genes Dev"},{"key":"2021110814400534400_ref6","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4939-6396-6","volume-title":"Plant Synthetic Promoters: Methods and Protocols","author":"Hehl","year":"2016"},{"key":"2021110814400534400_ref7","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.copbio.2015.10.001","article-title":"Plant synthetic promoters and transcription factors","volume":"37","author":"Liu","year":"2016","journal-title":"Curr Opin Biotechnol"},{"issue":"3","key":"2021110814400534400_ref8","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1016\/j.synbio.2017.09.003","article-title":"Transcription control engineering and applications in synthetic biology","volume":"2","author":"Engstrom","year":"2017","journal-title":"Synth Syst 
Biotechnol"},{"issue":"12","key":"2021110814400534400_ref9","doi-asserted-by":"crossref","first-page":"3309","DOI":"10.1105\/tpc.15.00630","article-title":"Core promoter plasticity between maize tissues and genotypes contrasts with predominance of sharp transcription initiation sites","volume":"27","author":"Mej\u00eda-Guerra","year":"2015","journal-title":"Plant Cell"},{"issue":"10","key":"2021110814400534400_ref10","doi-asserted-by":"crossref","first-page":"e79011","DOI":"10.1371\/journal.pone.0079011","article-title":"Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots","volume":"8","author":"Kumari","year":"2013","journal-title":"PLoS One"},{"key":"2021110814400534400_ref11","doi-asserted-by":"crossref","first-page":"D75","DOI":"10.1093\/nar\/gkp902","article-title":"Utrdb and utrsite (release 2010): a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs","volume":"38","author":"Grillo","year":"2010","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"2021110814400534400_ref12","first-page":"142","volume":"22","author":"Gordon","year":"2006","journal-title":"Improved prediction of bacterial transcription start sites"},{"issue":"12","key":"2021110814400534400_ref13","doi-asserted-by":"crossref","first-page":"i313","DOI":"10.1093\/bioinformatics\/btp191","article-title":"Toward a gold standard for promoter prediction evaluation","volume":"25","author":"Abeel","year":"2009","journal-title":"Bioinformatics"},{"issue":"4","key":"2021110814400534400_ref14","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1093\/bfgp\/elp014","article-title":"Identifying regulatory elements in eukaryotic genomes","volume":"8","author":"Narlikar","year":"2009","journal-title":"Brief Funct Genom Proteom"},{"issue":"3","key":"2021110814400534400_ref15","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1093\/bioinformatics\/btw630","article-title":"Pro54db: a database for 
experimentally verified sigma-54 promoters","volume":"33","author":"Liang","year":"2017","journal-title":"Bioinformatics"},{"issue":"17","key":"2021110814400534400_ref16","doi-asserted-by":"crossref","first-page":"2957","DOI":"10.1093\/bioinformatics\/btz016","article-title":"Multiply: a novel multi-layer predictor for discovering general and specific types of promoters","volume":"35","author":"Zhang","year":"2019","journal-title":"Bioinformatics"},{"issue":"19","key":"2021110814400534400_ref17","doi-asserted-by":"crossref","first-page":"4869","DOI":"10.1093\/bioinformatics\/btaa609","article-title":"iPromoter-BnCNN: a novel branched CNN-based predictor for identifying and classifying sigma promoters","volume":"36","author":"Amin","year":"2020","journal-title":"Bioinformatics"},{"key":"2021110814400534400_ref18","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.omtn.2019.05.028","article-title":"iProEP: a computational predictor for predicting promoter","volume":"17","author":"Lai","year":"2019","journal-title":"Mol Ther Nucleic Acids"},{"issue":"2","key":"2021110814400534400_ref19","doi-asserted-by":"crossref","first-page":"2126","DOI":"10.1093\/bib\/bbaa049","article-title":"Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework","volume":"22","author":"Li","year":"2021","journal-title":"Brief Bioinform"},{"issue":"D1","key":"2021110814400534400_ref20","doi-asserted-by":"crossref","first-page":"D51","DOI":"10.1093\/nar\/gkw1069","article-title":"The eukaryotic promoter database in its 30th year: focus on non-vertebrate organisms","volume":"45","author":"Dreos","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2021110814400534400_ref21","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-60761-854-6","volume-title":"Computational Biology of Transcription Factor 
Binding","author":"Ladunga","year":"2010"},{"issue":"9","key":"2021110814400534400_ref22","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1101\/gr.8.9.967","article-title":"A computer program for aligning a cDNA sequence with a genomic DNA sequence","volume":"8","author":"Florea","year":"1998","journal-title":"Genome Res"},{"issue":"4","key":"2021110814400534400_ref23","doi-asserted-by":"crossref","first-page":"656","DOI":"10.1101\/gr.229202","article-title":"BLAT\u2014the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res"},{"issue":"9","key":"2021110814400534400_ref24","doi-asserted-by":"crossref","first-page":"1859","DOI":"10.1093\/bioinformatics\/bti310","article-title":"GMAP: a genomic mapping and alignment program for mRNA and EST sequences","volume":"21","author":"Wu","year":"2005","journal-title":"Bioinformatics"},{"issue":"1","key":"2021110814400534400_ref25","first-page":"1","article-title":"An efficient full-length cDNA amplification strategy based on bioinformatics technology and multiplexed PCR methods","volume":"5","author":"Chen","year":"2016","journal-title":"Sci Rep"},{"issue":"6","key":"2021110814400534400_ref26","doi-asserted-by":"crossref","first-page":"e0157779","DOI":"10.1371\/journal.pone.0157779","article-title":"cDNA library enrichment of full length transcripts for SMRT long read sequencing","volume":"11","author":"Cartolano","year":"2016","journal-title":"PLoS One"},{"key":"2021110814400534400_ref27","first-page":"182","article-title":"Characterization of prokaryotic and eukaryotic promoters using hidden Markov models. 
Proceedings of the International Conference on Intelligent Systems for Molecular Biology","volume-title":"Saint Louis, Missouri","author":"Pedersen","year":"1996"},{"issue":"5","key":"2021110814400534400_ref28","doi-asserted-by":"crossref","first-page":"923","DOI":"10.1006\/jmbi.1995.0349","article-title":"Predicting Pol II promoter sequences using transcription factor binding sites","volume":"249","author":"Prestridge","year":"1995","journal-title":"J Mol Biol"},{"key":"2021110814400534400_ref29","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1007\/978-1-60761-854-6_5","article-title":"Identification of promoter regions and regulatory sites","volume-title":"Computational Biology of Transcription Factor Binding","author":"Solovyev","year":"2010"},{"issue":"5","key":"2021110814400534400_ref30","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1093\/bioinformatics\/12.5.391","article-title":"The prediction of vertebrate promoter regions using differential hexamer frequency analysis","volume":"12","author":"Hutchinson","year":"1996","journal-title":"Bioinformatics"},{"key":"2021110814400534400_ref31","article-title":"Computational identification of eukaryotic promoters based on cascaded deep capsule neural networks","author":"Zhu","year":"2020","journal-title":"Brief Bioinform"},{"issue":"5","key":"2021110814400534400_ref32","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/j.compbiolchem.2008.07.009","article-title":"The cross-species prediction of bacterial promoters using a support vector machine","volume":"32","author":"Towsey","year":"2008","journal-title":"Comput Biol Chem"},{"issue":"1","key":"2021110814400534400_ref33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-018-36308-0","article-title":"Image-based promoter prediction: a promoter prediction method based on evolutionarily generated patterns","volume":"8","author":"Wang","year":"2018","journal-title":"Sci 
Rep"},{"issue":"13","key":"2021110814400534400_ref34","doi-asserted-by":"crossref","first-page":"3560","DOI":"10.1093\/nar\/gkg570","article-title":"Dragon Gene Start Finder identifies approximate locations of the 5\u2019 ends of genes","volume":"31","author":"Bajic","year":"2003","journal-title":"Nucleic Acids Res"},{"issue":"14","key":"2021110814400534400_ref35","doi-asserted-by":"crossref","first-page":"e472","DOI":"10.1093\/bioinformatics\/btl250","article-title":"ARTS: accurate recognition of transcription starts in human","volume":"22","author":"Sonnenburg","year":"2006","journal-title":"Bioinformatics"},{"issue":"13","key":"2021110814400534400_ref36","doi-asserted-by":"crossref","first-page":"i24","DOI":"10.1093\/bioinformatics\/btn172","article-title":"ProSOM: core promoter prediction based on unsupervised clustering of DNA physical profiles","volume":"24","author":"Abeel","year":"2008","journal-title":"Bioinformatics"},{"key":"2021110814400534400_ref37","article-title":"Benchmarking available bacterial promoter prediction tools: potentialities and limitations","volume-title":"bioRxiv","author":"Cassiano","year":"2020"},{"key":"2021110814400534400_ref38","article-title":"Computational identification of eukaryotic promoters based on cascaded deep capsule neural networks","author":"Zhu","year":"2020","journal-title":"Brief Bioinform"},{"issue":"7","key":"2021110814400534400_ref39","doi-asserted-by":"crossref","first-page":"2746","DOI":"10.1105\/tpc.114.125617","article-title":"Paired-end analysis of transcription start sites in Arabidopsis reveals plant-specific promoter signatures","volume":"26","author":"Morton","year":"2014","journal-title":"Plant Cell"},{"issue":"23","key":"2021110814400534400_ref40","doi-asserted-by":"crossref","first-page":"3725","DOI":"10.1093\/bioinformatics\/btv464","article-title":"TIPR: transcription initiation pattern recognition on a genome 
scale","volume":"31","author":"Morton","year":"2015","journal-title":"Bioinformatics"},{"issue":"11","key":"2021110814400534400_ref41","first-page":"1","article-title":"Transprise: a novel machine learning approach for eukaryotic promoter prediction","volume":"2019","author":"Pachganov","year":"2019","journal-title":"PeerJ"},{"issue":"1\u20132","key":"2021110814400534400_ref42","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/j.artmed.2005.02.005","article-title":"Computational modeling of oligonucleotide positional densities for human promoter prediction","volume":"35","author":"Narang","year":"2005","journal-title":"Artif Intell Med"},{"key":"2021110814400534400_ref43","first-page":"282","article-title":"Conditional random fields: probabilistic models for segmenting and labeling sequence data","volume-title":"Machine Learning-International Workshop then conference","author":"Lafferty","year":"2001"},{"key":"2021110814400534400_ref44","first-page":"1441","article-title":"Comparative gene prediction using conditional random fields","volume":"2017","author":"Vinson","year":"2006","journal-title":"AdvNeural Inf Process Syst"},{"issue":"12","key":"2021110814400534400_ref45","doi-asserted-by":"crossref","first-page":"1571","DOI":"10.1093\/bioinformatics\/bts176","article-title":"Automated gene-model curation using global discriminative learning","volume":"28","author":"Bernal","year":"2012","journal-title":"Bioinformatics"},{"issue":"9","key":"2021110814400534400_ref46","doi-asserted-by":"crossref","first-page":"1389","DOI":"10.1101\/gr.6558107","article-title":"Gene prediction using conditional random fields","volume":"17","author":"DeCaprio","year":"2007","journal-title":"Genome Res"},{"key":"2021110814400534400_ref47","first-page":"868","article-title":"Teamdl at semeval-2018 task 8: cybersecurity text analysis using convolutional neural network and conditional random fields","volume-title":"Proceedings of The 12th International Workshop on Semantic 
Evaluation","author":"Ravikiran","year":"2018"},{"issue":"2","key":"2021110814400534400_ref48","doi-asserted-by":"crossref","first-page":"e6","DOI":"10.1093\/pcp\/pcs183","article-title":"Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics","volume":"54","author":"Sakai","year":"2013","journal-title":"Plant Cell Physiol"},{"issue":"14","key":"2021110814400534400_ref49","doi-asserted-by":"crossref","first-page":"1931","DOI":"10.1093\/bioinformatics\/bts293","article-title":"MotifSuite: workflow for probabilistic motif detection and assessment","volume":"28","author":"Claeys","year":"2012","journal-title":"Bioinformatics"},{"issue":"D1","key":"2021110814400534400_ref50","doi-asserted-by":"crossref","first-page":"D260","DOI":"10.1093\/nar\/gkx1126","article-title":"JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework","volume":"46","author":"Khan","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2021110814400534400_ref51","doi-asserted-by":"crossref","first-page":"D754","DOI":"10.1093\/nar\/gkx1098","article-title":"Ensembl 2018","volume":"46","author":"Zerbino","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2021110814400534400_ref52","first-page":"309","article-title":"Identifying CPG islands in genome using conditional random fields","volume-title":"International Conference on Intelligent Computing","author":"Liu","year":"2012"},{"issue":"4","key":"2021110814400534400_ref53","doi-asserted-by":"crossref","first-page":"186","DOI":"10.5732\/cjc.012.10112","article-title":"Detection and characterization of regulatory elements using probabilistic conditional random field and hidden Markov models","volume":"32","author":"Wang","year":"2013","journal-title":"Chinese J 
Cancer"},{"issue":"22","key":"2021110814400534400_ref54","doi-asserted-by":"crossref","first-page":"3143","DOI":"10.1093\/bioinformatics\/btu519","article-title":"Detection of active transcription factor binding sites with the combination of DNase hypersensitivity and histone modifications","volume":"30","author":"Gusmao","year":"2014","journal-title":"Bioinformatics"},{"issue":"8","key":"2021110814400534400_ref55","doi-asserted-by":"crossref","first-page":"S18","DOI":"10.1186\/1471-2164-13-S8-S18","article-title":"CTF: a CRF-based transcription factor binding sites finding system","volume":"13","author":"He","year":"2012","journal-title":"BMC Genomics"},{"key":"2021110814400534400_ref56","first-page":"D37","article-title":"DiProDB: a database for dinucleotide properties","volume-title":"Nucleic Acids Res","author":"Friedel","year":"2008"},{"issue":"1","key":"2021110814400534400_ref57","doi-asserted-by":"crossref","first-page":"973","DOI":"10.1186\/s12864-016-3292-z","article-title":"Structural features of DNA that determine RNA polymerase II core promoter","volume":"17","author":"Il\u2019icheva","year":"2016","journal-title":"BMC Genomics"},{"key":"2021110814400534400_ref58","doi-asserted-by":"crossref","DOI":"10.1104\/pp.110.167809","article-title":"DNA free energy based promoter prediction and comparative analysis of Arabidopsis and rice genomes","volume-title":"Plant Physiol","author":"Morey","year":"2011"},{"key":"2021110814400534400_ref59","first-page":"e1004418","article-title":"Contribution of sequence motif, chromatin state, and DNA structure features to predictive models of transcription factor binding in yeast","volume-title":"PLoS Comput Biol","author":"Tsai","year":"2015"},{"issue":"10","key":"2021110814400534400_ref60","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1003234","article-title":"ToPS: a framework to manipulate probabilistic models of sequence data","volume":"9","author":"Kashiwabara","year":"2013","journal-title":"PLoS Comput 
Biol"},{"issue":"6","key":"2021110814400534400_ref61","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1093\/bioinformatics\/btq033","article-title":"Ira M Hall. BEDTools: a flexible suite of utilities for comparing genomic features","volume":"26","author":"Quinlan","year":"2010","journal-title":"Bioinformatics"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/22\/6\/bbab198\/41087809\/bbab198.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/22\/6\/bbab198\/41087809\/bbab198.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,3]],"date-time":"2023-11-03T07:41:22Z","timestamp":1698997282000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbab198\/6287335"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,31]]},"references-count":61,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2021,11,5]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbab198","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,11]]},"published":{"date-parts":[[2021,5,31]]},"article-number":"bbab198"}}