{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T03:22:28Z","timestamp":1761708148662},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3242,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Conserved motifs often represent biological significance, providing insight on biological aspects such as gene transcription regulation, biomolecular secondary structure, presence of non-coding RNAs and evolution history. With the increasing number of sequenced genomic data, faster and more accurate tools are needed to automate the process of motif discovery.<\/jats:p>\n               <jats:p>Results: We propose a deterministic sequential Monte Carlo (DSMC) motif discovery technique based on the position weight matrix (PWM) model to locate conserved motifs in a given set of nucleotide sequences, and extend our model to search for instances of the motif with insertions\/deletions. 
We show that the proposed method can be used to align the motif where there are insertions and deletions found in different instances of the motif, which cannot be satisfactorily done using other multiple alignment and motif discovery algorithms.<\/jats:p>\n               <jats:p>Availability: MATLAB code is available at http:\/\/www.ee.columbia.edu\/~kcliang<\/jats:p>\n               <jats:p>Contact: \u00a0xw2008@columbia.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm543","type":"journal-article","created":{"date-parts":[[2007,11,18]],"date-time":"2007-11-18T01:26:02Z","timestamp":1195349162000},"page":"46-55","source":"Crossref","is-referenced-by-count":19,"title":["A profile-based deterministic sequential Monte Carlo algorithm for motif discovery"],"prefix":"10.1093","volume":"24","author":[{"given":"Kuo-Ching","family":"Liang","sequence":"first","affiliation":[{"name":"Columbia University, Department of Electrical Engineering, New York, NY 10025, USA"}]},{"given":"Xiaodong","family":"Wang","sequence":"additional","affiliation":[{"name":"Columbia University, Department of Electrical Engineering, New York, NY 10025, USA"}]},{"given":"Dimitris","family":"Anastassiou","sequence":"additional","affiliation":[{"name":"Columbia University, Department of Electrical Engineering, New York, NY 10025, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,11,17]]},"reference":[{"key":"2023020209450883600_B1","doi-asserted-by":"crossref","first-page":"R2","DOI":"10.1186\/gb-2006-7-1-r2","article-title":"Variable window binding for mutually exclusive alternative binding","volume":"7","author":"Anastassiou","year":"2006","journal-title":"Genome Biol"},{"key":"2023020209450883600_B2","article-title":"Unsupervised learning of multiple motifs in biopolymers using expectation maximization","author":"Bailey","year":"1993","journal-title":"Technical Report"},{"key":"2023020209450883600_B3","first-page":"28","article-title":"Fitting a mixture model by expectation 
maximization to discover motifs in biopolymers","volume-title":"In Proceedings of the 2nd Int'l Conference on Intelligent Systems for Molecular Biology.","author":"Bailey","year":"1994"},{"key":"2023020209450883600_B4","doi-asserted-by":"crossref","first-page":"4442","DOI":"10.1093\/nar\/gkf578","article-title":"Additivity in protein-DNA interactions: how good an approximation is it?","volume":"30","author":"Benos","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023020209450883600_B5","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1089\/10665270252935430","article-title":"Finding motifs using random projections","volume":"9","author":"Buhler","year":"2002","journal-title":"J. Comput. Biol"},{"key":"2023020209450883600_B6","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1093\/nar\/30.5.1255","article-title":"Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors","volume":"30","author":"Bulyk","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023020209450883600_B7","volume-title":"Statistical Distributions.","author":"Evans","year":"2002","edition":"3rd"},{"key":"2023020209450883600_B8","article-title":"Sequential Monte Carlo methods in filter theory","volume-title":"Ph.D. Dissertation.","author":"Fearnhead","year":"1998"},{"key":"2023020209450883600_B9","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1023\/B:STCO.0000009418.04621.cd","article-title":"Particle filters for mixture models with an unknown number of components","volume":"14","author":"Fearnhead","year":"2004","journal-title":"J. Stat. 
Comput"},{"key":"2023020209450883600_B10","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/j.cell.2005.07.028","article-title":"Mutually exclusive splicing of the insect Dscam pre-mRNA directed by competing intronic RNA secondary structures","volume":"123","author":"Graveley","year":"2005","journal-title":"Cell"},{"key":"2023020209450883600_B11","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1093\/bioinformatics\/15.7.563","article-title":"Identifying DNA and protein patterns with statistically significant alignment of multiple sequences","volume":"15","author":"Hertz","year":"1999","journal-title":"Bioinformatics"},{"key":"2023020209450883600_B12","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1006\/jmbi.2000.3519","article-title":"Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae","volume":"296","author":"Hughes","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023020209450883600_B13","doi-asserted-by":"crossref","first-page":"1557","DOI":"10.1093\/bioinformatics\/bth127","article-title":"BioOptimizer: a Bayesian scoring function approach to motif discovery","volume":"20","author":"Jensen","year":"2004","journal-title":"Bioinformatics"},{"key":"2023020209450883600_B14","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1214\/088342304000000107","article-title":"Computational discovery of gene regulatory binding motifs: a Bayesian perspective","volume":"19","author":"Jensen","year":"2004","journal-title":"Stat. 
Sci"},{"key":"2023020209450883600_B15","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1093\/bioinformatics\/14.10.846","article-title":"Hidden Markov Models for detecting remote protein homologies","volume":"14","author":"Karplus","year":"1998","journal-title":"Bioinformatics"},{"key":"2023020209450883600_B16","doi-asserted-by":"crossref","first-page":"1501","DOI":"10.1006\/jmbi.1994.1104","article-title":"Hidden Markov models in computational biology: applications to protein modeling","volume":"235","author":"Krogh","year":"1994","journal-title":"J. Mol. Biol"},{"key":"2023020209450883600_B17","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1126\/science.8211139","article-title":"Detecting subtle signals: a Gibbs sampling strategy for multiple alignment","volume":"262","author":"Lawrence","year":"1993","journal-title":"Science"},{"key":"2023020209450883600_B18","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1002\/prot.340070105","article-title":"An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences","volume":"7","author":"Lawrence","year":"1990","journal-title":"Proteins Struct. Funct. Genet"},{"key":"2023020209450883600_B19","article-title":"Statistical models for biological sequence motif discovery","volume-title":"Case Studies in Bayesian Statistics VI.","author":"Liu","year":"2004"},{"key":"2023020209450883600_B20","article-title":"Bioprospector: discover conserved DNA motifs in upstream regulatory regions of co-expressed genes","author":"Liu","year":"2001"},{"key":"2023020209450883600_B21","doi-asserted-by":"crossref","first-page":"5373","DOI":"10.1128\/JB.181.17.5373-5383.1999","article-title":"Regulation of mga transcription in the Group A Streptococcus: specific binding of Mga within its own promoter and evidence for a negative regulator","volume":"7","author":"McIver","year":"1999","journal-title":"J. 
Bacteriol"},{"key":"2023020209450883600_B22","first-page":"269","article-title":"Combinatorial approaches to finding subtle signals in DNA sequences","volume-title":"In Proceedings of the 8th Int'l Conferences on Intelligent Systems for Molecular Biology.","author":"Pevzner","year":"2000"},{"key":"2023020209450883600_B23","article-title":"Sequential Monte Carlo methods for digital communications","volume-title":"Ph.D. dissertation.","author":"Punskaya","year":"2003"},{"key":"2023020209450883600_B24","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1109\/TCBB.2004.14","article-title":"A uniform projection method for motif discovery in DNA sequences","volume":"1","author":"Raphael","year":"2004","journal-title":"IEEE Trans. Comput. Biol. Bioinform"},{"key":"2023020209450883600_B25","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1038\/nbt1098-939","article-title":"Finding DNA regulatory motifs within unaligned non-coding sequences clustered by whole-genome mRNA quantitation","volume":"10","author":"Roth","year":"1998","journal-title":"Nat. Biotechnol"},{"key":"2023020209450883600_B26","doi-asserted-by":"crossref","first-page":"1183","DOI":"10.1073\/pnas.86.4.1183","article-title":"Identifying protein-binding sites from unaligned DNA fragments","volume":"86","author":"Stormo","year":"1989","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020209450883600_B27","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1038\/nbt1053","article-title":"Assessing computational tools for the discovery of transcription factor binding sites","volume":"23","author":"Tompa","year":"2005","journal-title":"Nat. 
Biotechnol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/1\/46\/49043876\/bioinformatics_24_1_46.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/1\/46\/49043876\/bioinformatics_24_1_46.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T10:09:17Z","timestamp":1675332557000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/1\/46\/205750"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,11,17]]},"references-count":27,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm543","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,1,1]]},"published":{"date-parts":[[2007,11,17]]}}}