{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:25:36Z","timestamp":1758270336020},"reference-count":15,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1795,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: How to find motifs from genome-scale functional sequences, such as all the promoters in a genome, is a challenging problem. Word-based methods count the occurrences of oligomers to detect excessively represented ones. This approach is known to be fast and accurate compared with other methods. However, two problems have hampered the application of such methods to large-scale data. One is the computational cost necessary for clustering similar oligomers, and the other is the bias in the frequency of fixed-length oligomers, which complicates the detection of significant words.<\/jats:p>\n               <jats:p>Results: We introduce a method that uses a DNA Gray code and equiprobable oligomers, which solve the clustering problem and the oligomer bias, respectively. Our method can analyze 18 000 sequences of ~1 kbp long in 30 s. We also show that the accuracy of our method is superior to that of a leading method, especially for large-scale data and small fractions of motif-containing sequences.<\/jats:p>\n               <jats:p>Availability: The online and stand-alone versions of the application, named Hegma, are available at our website: http:\/\/www.genome.ist.i.kyoto-u.ac.jp\/~ichinose\/hegma\/<\/jats:p>\n               <jats:p>Contact: \u00a0ichinose@i.kyoto-u.ac.jp; o.gotoh@i.kyoto-u.ac.jp<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr606","type":"journal-article","created":{"date-parts":[[2011,11,5]],"date-time":"2011-11-05T04:36:36Z","timestamp":1320467796000},"page":"25-31","source":"Crossref","is-referenced-by-count":12,"title":["Large-scale motif discovery using DNA Gray code and equiprobable oligomers"],"prefix":"10.1093","volume":"28","author":[{"given":"Natsuhiro","family":"Ichinose","sequence":"first","affiliation":[{"name":"Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan"}]},{"given":"Tetsushi","family":"Yada","sequence":"additional","affiliation":[{"name":"Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan"}]},{"given":"Osamu","family":"Gotoh","sequence":"additional","affiliation":[{"name":"Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan"}]}],"member":"286","published-online":{"date-parts":[[2011,11,3]]},"reference":[{"key":"2023061011453235200_B1","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-59281-2","volume-title":"Chaos. an Introduction to Dynamical Systems.","author":"Alligood","year":"1997"},{"key":"2023061011453235200_B2","first-page":"28","article-title":"Fitting a mixture model by expectation maximization to discover motifs in biopolymers","volume-title":"Proceedings of the 2nd International Conference on Intelligent Systems for Molecular Biology.","author":"Bailey","year":"1994"},{"issue":"Suppl. 7","key":"2023061011453235200_B3","doi-asserted-by":"crossref","first-page":"S21","DOI":"10.1186\/1471-2105-8-S7-S21","article-title":"A survey of DNA motif finding algorithms","volume":"8","author":"Das","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023061011453235200_B4","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1109\/TC.1984.5009360","article-title":"On generating the N-ary reflected gray codes","volume":"C-33","author":"Er","year":"1984","journal-title":"IEEE Trans. Comp."},{"key":"2023061011453235200_B5","author":"Gray","year":"1947","journal-title":"Pulse code communication. U.S. Patent 2632058."},{"key":"2023061011453235200_B6","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1126\/science.8211139","article-title":"Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment","volume":"262","author":"Lawrence","year":"1993","journal-title":"Science"},{"key":"2023061011453235200_B7","doi-asserted-by":"crossref","first-page":"W199","DOI":"10.1093\/nar\/gkh465","article-title":"Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes","volume":"32","author":"Pavesi","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023061011453235200_B8","doi-asserted-by":"crossref","first-page":"D68","DOI":"10.1093\/nar\/gkj075","article-title":"cisRED: a database system for genome scale computational discovery of regulatory elements","volume":"34","author":"Robertson","year":"2006","journal-title":"Nucleic Acids Res."},{"issue":"Suppl. 1","key":"2023061011453235200_B9","doi-asserted-by":"crossref","first-page":"D91","DOI":"10.1093\/nar\/gkh012","article-title":"JASPAR: an open-access database for eukaryotic transcription factor binding profiles","volume":"32","author":"Sandelin","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023061011453235200_B10","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/1745-6150-1-11","article-title":"A survey of motif discovery methods in an integrated framework","volume":"1","author":"Sandve","year":"2006","journal-title":"Biol. Direct"},{"key":"2023061011453235200_B11","doi-asserted-by":"crossref","first-page":"6097","DOI":"10.1093\/nar\/18.20.6097","article-title":"Sequence logos: a new way to display consensus sequences","volume":"18","author":"Schneider","year":"1990","journal-title":"Nucleic Acids Res."},{"key":"2023061011453235200_B12","doi-asserted-by":"crossref","first-page":"1711","DOI":"10.1101\/gr.2435604","article-title":"Sequence comparison of human and mouse genes reveals a homologous block structure in the promoter regions","volume":"14","author":"Suzuki","year":"2004","journal-title":"Genome Res."},{"key":"2023061011453235200_B13","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1038\/nbt1053","article-title":"Assessing computational tools for the discovery of transcription factor binding sites","volume":"23","author":"Tompa","year":"2005","journal-title":"Nat. Biotechnol."},{"issue":"Suppl. 1","key":"2023061011453235200_B14","first-page":"D97","article-title":"DBTSS: database of transcription start sites, progress report 2008","volume":"36","author":"Wakaguri","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023061011453235200_B15","first-page":"55","article-title":"TRANSFAC, TRANSPATH and CYTOMER as starting points for an ontology of regulatory networks","volume":"4","author":"Wingender","year":"2004","journal-title":"In Silico Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/1\/25\/50568456\/bioinformatics_28_1_25.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/1\/25\/50568456\/bioinformatics_28_1_25.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,10]],"date-time":"2023-06-10T11:46:52Z","timestamp":1686397612000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/1\/25\/219964"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,11,3]]},"references-count":15,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr606","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,1,1]]},"published":{"date-parts":[[2011,11,3]]}}}