{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T00:50:26Z","timestamp":1776732626602,"version":"3.51.2"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2021,3,24]],"date-time":"2021-03-24T00:00:00Z","timestamp":1616544000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 GM103544"],"award-info":[{"award-number":["R01 GM103544"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Sequence motif discovery algorithms can identify novel sequence patterns that perform biological functions in DNA, RNA and protein sequences\u2014for example, the binding site motifs of DNA- and RNA-binding proteins.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>The STREME algorithm presented here advances the state-of-the-art in ab initio motif discovery in terms of both accuracy and versatility. Using in vivo DNA (ChIP-seq) and RNA (CLIP-seq) data, and validating motifs with reference motifs derived from in vitro data, we show that STREME is more accurate, sensitive and thorough than several widely used algorithms (DREME, HOMER, MEME, Peak-motifs) and two other representative algorithms (ProSampler and Weeder). STREME\u2019s capabilities include the ability to find motifs in datasets with hundreds of thousands of sequences, to find both short and long motifs (from 3 to 30 positions), to perform differential motif discovery in pairs of sequence datasets, and to find motifs in sequences over virtually any alphabet (DNA, RNA, protein and user-defined alphabets). Unlike most motif discovery algorithms, STREME reports a useful estimate of the statistical significance of each motif it discovers. STREME is easy to use individually via its web server or via the command line, and is completely integrated with the widely used MEME Suite of sequence analysis tools. The name STREME stands for \u2018Simple, Thorough, Rapid, Enriched Motif Elicitation\u2019.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The STREME web server and source code are provided freely for non-commercial use at http:\/\/meme-suite.org.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab203","type":"journal-article","created":{"date-parts":[[2021,3,23]],"date-time":"2021-03-23T16:36:48Z","timestamp":1616517408000},"page":"2834-2840","source":"Crossref","is-referenced-by-count":515,"title":["STREME: accurate and versatile sequence motif discovery"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7018-9342","authenticated-orcid":false,"given":"Timothy L","family":"Bailey","sequence":"first","affiliation":[{"name":"Department of Pharmacology, University of Nevada , Reno, NV 89557, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,3,24]]},"reference":[{"key":"2023061310574045400_btab203-B1","doi-asserted-by":"crossref","first-page":"1653","DOI":"10.1093\/bioinformatics\/btr261","article-title":"DREME: motif discovery in transcription factor ChIP-seq data","volume":"27","author":"Bailey","year":"2011","journal-title":"Bioinformatics"},{"key":"2023061310574045400_btab203-B2","first-page":"21","author":"Bailey","year":"1995"},{"key":"2023061310574045400_btab203-B3","doi-asserted-by":"crossref","first-page":"47","DOI":"10.32607\/20758251-2017-9-2-47-58","article-title":"C2h2 zinc finger proteins: the largest but poorly explored family of higher eukaryotic transcription factors","volume":"9","author":"Fedotova","year":"2017","journal-title":"Acta Nat"},{"key":"2023061310574045400_btab203-B4","doi-asserted-by":"crossref","first-page":"87","DOI":"10.2307\/2340521","article-title":"On the interpretation of \u03c72 from contingency tables, and the calculation of p","volume":"85","author":"Fisher","year":"1922","journal-title":"J. R. Stat. Soc"},{"key":"2023061310574045400_btab203-B5","doi-asserted-by":"crossref","first-page":"R24","DOI":"10.1186\/gb-2007-8-2-r24","article-title":"Quantifying similarity between motifs","volume":"8","author":"Gupta","year":"2007","journal-title":"Genome Biol"},{"key":"2023061310574045400_btab203-B6","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.molcel.2010.05.004","article-title":"Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities","volume":"38","author":"Heinz","year":"2010","journal-title":"Mol. Cell"},{"key":"2023061310574045400_btab203-B7","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1016\/j.cell.2012.12.009","article-title":"DNA-binding specificities of human transcription factors","volume":"152","author":"Jolma","year":"2013","journal-title":"Cell"},{"key":"2023061310574045400_btab203-B8","doi-asserted-by":"crossref","first-page":"R12","DOI":"10.1186\/gb-2004-5-2-r12","article-title":"Versatile and open software for comparing large genomes","volume":"5","author":"Kurtz","year":"2004","journal-title":"Genome Biol"},{"key":"2023061310574045400_btab203-B9","first-page":"4632","article-title":"ProSampler: an ultrafast and accurate motif finder in large ChIP-seq datasets for combinatory motif discovery","volume":"35","author":"Li","year":"2019","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023061310574045400_btab203-B10","doi-asserted-by":"crossref","first-page":"1696","DOI":"10.1093\/bioinformatics\/btr189","article-title":"MEME-ChIP: motif analysis of large DNA datasets","volume":"27","author":"Machanick","year":"2011","journal-title":"Bioinformatics"},{"key":"2023061310574045400_btab203-B11","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1145\/321941.321946","article-title":"A space-economical suffix tree construction algorithm","volume":"23","author":"McCreight","year":"1976","journal-title":"J. ACM"},{"key":"2023061310574045400_btab203-B12","doi-asserted-by":"crossref","first-page":"i311","DOI":"10.1093\/bioinformatics\/bti1044","article-title":"Computing the P-value of the information content from an alignment of multiple sequences","volume":"21","author":"Nagarajan","year":"2005","journal-title":"Bioinformatics"},{"key":"2023061310574045400_btab203-B13","doi-asserted-by":"crossref","first-page":"W199","DOI":"10.1093\/nar\/gkh465","article-title":"Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes","volume":"32","author":"Pavesi","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023061310574045400_btab203-B14","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1038\/nature12311","article-title":"A compendium of RNA-binding motifs for decoding gene regulation","volume":"499","author":"Ray","year":"2013","journal-title":"Nature"},{"key":"2023061310574045400_btab203-B15","doi-asserted-by":"crossref","first-page":"e126","DOI":"10.1093\/nar\/gkr574","article-title":"STEME: efficient EM to find motifs in large data sets","volume":"39","author":"Reid","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023061310574045400_btab203-B16","doi-asserted-by":"crossref","first-page":"6097","DOI":"10.1093\/nar\/18.20.6097","article-title":"Sequence logos: a new way to display consensus sequences","volume":"18","author":"Schneider","year":"1990","journal-title":"Nucleic Acids Res"},{"key":"2023061310574045400_btab203-B17","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1093\/bioinformatics\/16.1.16","article-title":"DNA binding sites: representation and discovery","volume":"16","author":"Stormo","year":"2000","journal-title":"Bioinformatics"},{"key":"2023061310574045400_btab203-B18","doi-asserted-by":"crossref","first-page":"W86","DOI":"10.1093\/nar\/gkr377","article-title":"RSAT 2011: regulatory sequence analysis tools","volume":"39","author":"Thomas-Chollier","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023061310574045400_btab203-B19","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1038\/nmeth.3810","article-title":"Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP)","volume":"13","author":"Van Nostrand","year":"2016","journal-title":"Nat. Methods"},{"key":"2023061310574045400_btab203-B20","first-page":"1","author":"Weiner","year":"1973"},{"key":"2023061310574045400_btab203-B21","first-page":"1","article-title":"Probability plotting methods for the analysis of data","volume":"55","author":"Wilk","year":"1968","journal-title":"Biometrika"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab203\/37735449\/btab203.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/18\/2834\/50579626\/btab203.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/18\/2834\/50579626\/btab203.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T06:59:52Z","timestamp":1686639592000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/18\/2834\/6184861"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,3,24]]},"references-count":21,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2021,9,29]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab203","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.11.23.394619","asserted-by":"object"}],"has-review":[{"id-type":"doi","id":"10.3410\/f.739811364.793589080","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,9,15]]},"published":{"date-parts":[[2021,3,24]]}}}