{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,9,16]],"date-time":"2023-09-16T19:31:24Z","timestamp":1694892684284},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Generating accurate transcription factor (TF) binding site motifs from data generated using the next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and the length of each sequence is relatively long. Most traditional motif finders are slow in handling such enormous amount of data. To overcome this limitation, tools have been developed that compromise accuracy with speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization.<\/jats:p>\n               <jats:p>Results: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use area under receiver-operating characteristic curve (AUC) as a measure of discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated\/optimized using training sets and evaluated on test sets. 
The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%.<\/jats:p>\n               <jats:p>Availability and implementation: DiMO is available at http:\/\/stormo.wustl.edu\/DiMO<\/jats:p>\n               <jats:p>Contact: rpatel@genetics.wustl.edu, ronakypatel@gmail.com<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt748","type":"journal-article","created":{"date-parts":[[2013,12,26]],"date-time":"2013-12-26T01:19:58Z","timestamp":1388020798000},"page":"941-948","source":"Crossref","is-referenced-by-count":19,"title":["Discriminative motif optimization based on perceptron training"],"prefix":"10.1093","volume":"30","author":[{"given":"Ronak Y.","family":"Patel","sequence":"first","affiliation":[{"name":"Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA"}]},{"given":"Gary D.","family":"Stormo","sequence":"additional","affiliation":[{"name":"Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA"}]}],"member":"286","published-online":{"date-parts":[[2013,12,24]]},"reference":[{"key":"2023012710453442200_btt748-B1","doi-asserted-by":"crossref","first-page":"1653","DOI":"10.1093\/bioinformatics\/btr261","article-title":"DREME: motif discovery in transcription factor ChIP-seq data","volume":"27","author":"Bailey","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B2","doi-asserted-by":"crossref","first-page":"e128","DOI":"10.1093\/nar\/gks433","article-title":"Inferring direct DNA binding from ChIP-seq","volume":"40","author":"Bailey","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012710453442200_btt748-B3","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1186\/cc3000","article-title":"Statistics review 13: receiver operating characteristic curves","volume":"8","author":"Bewick","year":"2004","journal-title":"Crit. 
Care"},{"key":"2023012710453442200_btt748-B4","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1186\/1471-2105-10-388","article-title":"DISPARE: DIScriminative PAttern REfinement for position weight matrices","volume":"10","author":"da Piedade","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012710453442200_btt748-B5","doi-asserted-by":"crossref","first-page":"e40373","DOI":"10.1371\/journal.pone.0040373","article-title":"POWRS: position-sensitive motif discovery","volume":"7","author":"Davis","year":"2012","journal-title":"PLoS One"},{"key":"2023012710453442200_btt748-B6","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.molcel.2007.09.027","article-title":"A universal framework for regulatory element discovery across all genomes and data types","volume":"28","author":"Elemento","year":"2007","journal-title":"Mol. Cell"},{"key":"2023012710453442200_btt748-B7","doi-asserted-by":"crossref","first-page":"2303","DOI":"10.1093\/bioinformatics\/btn444","article-title":"Seeder: discriminative seeding DNA motif discovery","volume":"24","author":"Fauteux","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B8","doi-asserted-by":"crossref","first-page":"i321","DOI":"10.1093\/bioinformatics\/btp230","article-title":"DISCOVER: a feature-based discriminative method for motif search in complex genomes","volume":"25","author":"Fu","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B9","doi-asserted-by":"crossref","first-page":"840","DOI":"10.1038\/nrg3306","article-title":"ChIP-seq and beyond: new and improved methodologies to detect and characterize protein-DNA interactions","volume":"13","author":"Furey","year":"2012","journal-title":"Nat. Rev. Genet."},{"key":"2023012710453442200_btt748-B10","doi-asserted-by":"crossref","first-page":"818","DOI":"10.1111\/j.1553-2712.1997.tb03793.x","article-title":"Statistical methodology: III. 
Receiver operating characteristic (ROC) curves","volume":"4","author":"Grzybowski","year":"1997","journal-title":"Acad. Emerg. Med."},{"key":"2023012710453442200_btt748-B11","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1101\/gr.139881.112","article-title":"P-value-based regulatory motif discovery using positional weight matrices","volume":"23","author":"Hartmann","year":"2013","journal-title":"Genome Res."},{"key":"2023012710453442200_btt748-B12","doi-asserted-by":"crossref","first-page":"2361","DOI":"10.1093\/bioinformatics\/btr412","article-title":"DECOD: fast and accurate discriminative DNA motif finding","volume":"27","author":"Huggins","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B13","doi-asserted-by":"crossref","first-page":"2622","DOI":"10.1093\/bioinformatics\/btq488","article-title":"Deep and wide digging for binding motifs in ChIP-Seq data","volume":"26","author":"Kulakovskiy","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B14","doi-asserted-by":"crossref","first-page":"2217","DOI":"10.1093\/bioinformatics\/btl371","article-title":"Finding motifs from all sequences with and without binding sites","volume":"22","author":"Leung","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B15","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1093\/bioinformatics\/btm080","article-title":"GAPWM: a genetic algorithm method for optimizing a position weight matrix","volume":"23","author":"Li","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B16","doi-asserted-by":"crossref","first-page":"1180","DOI":"10.1101\/gr.076117.108","article-title":"Transcription factor and microRNA motif discovery: the Amadeus platform and a compendium of metazoan target sets","volume":"18","author":"Linhart","year":"2008","journal-title":"Genome 
Res."},{"key":"2023012710453442200_btt748-B17","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1038\/nbt717","article-title":"An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments","volume":"20","author":"Liu","year":"2002","journal-title":"Nat. Biotechnol."},{"key":"2023012710453442200_btt748-B18","doi-asserted-by":"crossref","first-page":"2826","DOI":"10.1093\/bioinformatics\/btq546","article-title":"Identification of context-dependent motifs by contrasting ChIP binding data","volume":"26","author":"Mason","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B19","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1186\/1471-2105-8-385","article-title":"Discriminative motif discovery in DNA and protein sequences using the DEME algorithm","volume":"8","author":"Redhead","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012710453442200_btt748-B20","volume-title":"Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms","author":"Rosenblatt","year":"1962"},{"key":"2023012710453442200_btt748-B21","doi-asserted-by":"crossref","first-page":"i387","DOI":"10.1093\/bioinformatics\/bti1002","article-title":"A motif-based framework for recognizing sequence families","volume":"21","author":"Sharan","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B22","doi-asserted-by":"crossref","first-page":"e24576","DOI":"10.1371\/journal.pone.0024576","article-title":"AMD, an automated motif discovery tool using stepwise refinement of gapped consensuses","volume":"6","author":"Shi","year":"2011","journal-title":"PLoS One"},{"key":"2023012710453442200_btt748-B23","doi-asserted-by":"crossref","first-page":"e1000156","DOI":"10.1371\/journal.pcbi.1000156","article-title":"PhyloGibbs-MP: module prediction and discriminative motif-finding by Gibbs 
sampling","volume":"4","author":"Siddharthan","year":"2008","journal-title":"PLoS Comput. Biol."},{"key":"2023012710453442200_btt748-B24","doi-asserted-by":"crossref","first-page":"599","DOI":"10.1089\/10665270360688219","article-title":"Discriminative motifs","volume":"10","author":"Sinha","year":"2003","journal-title":"J. Comput. Biol."},{"key":"2023012710453442200_btt748-B25","doi-asserted-by":"crossref","first-page":"e454","DOI":"10.1093\/bioinformatics\/btl227","article-title":"On counting position weight matrix matches in a sequence, with application to discriminative motif finding","volume":"22","author":"Sinha","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012710453442200_btt748-B26","doi-asserted-by":"crossref","first-page":"1560","DOI":"10.1073\/pnas.0406123102","article-title":"Identifying tissue-selective transcription factor binding sites in vertebrate promoters","volume":"102","author":"Smith","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012710453442200_btt748-B27","doi-asserted-by":"crossref","first-page":"2997","DOI":"10.1093\/nar\/10.9.2997","article-title":"Use of the \u2018Perceptron\u2019 algorithm to distinguish translational initiation sites in E. coli","volume":"10","author":"Stormo","year":"1982","journal-title":"Nucleic Acids Res."},{"key":"2023012710453442200_btt748-B28","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1038\/nrg2845","article-title":"Determining the specificity of protein-DNA interactions","volume":"11","author":"Stormo","year":"2010","journal-title":"Nat. Rev. 
Genet."},{"key":"2023012710453442200_btt748-B29","doi-asserted-by":"crossref","first-page":"e31","DOI":"10.1093\/nar\/gkr1104","article-title":"RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets","volume":"40","author":"Thomas-Chollier","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012710453442200_btt748-B30","doi-asserted-by":"crossref","first-page":"1551","DOI":"10.1038\/nprot.2012.088","article-title":"A complete workflow for the analysis of full-size ChIP-seq (and similar) data sets using peak-motifs","volume":"7","author":"Thomas-Chollier","year":"2012","journal-title":"Nat. Protoc."},{"key":"2023012710453442200_btt748-B31","doi-asserted-by":"crossref","first-page":"W412","DOI":"10.1093\/nar\/gki492","article-title":"WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar","volume":"33","author":"Wang","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012710453442200_btt748-B32","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1038\/nbt.2486","article-title":"Evaluation of methods for modeling transcription factor sequence specificity","volume":"31","author":"Weirauch","year":"2013","journal-title":"Nat. 
Biotechnol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/7\/941\/48921403\/bioinformatics_30_7_941.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/7\/941\/48921403\/bioinformatics_30_7_941.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T11:17:22Z","timestamp":1674818242000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/7\/941\/236282"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,12,24]]},"references-count":32,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2014,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt748","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,4,1]]},"published":{"date-parts":[[2013,12,24]]}}}