{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:45:14Z","timestamp":1740185114575,"version":"3.37.3"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"1","funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Local motifs are patterns of DNA or protein sequences that occur within a sequence interval relative to a biologically defined anchor or landmark. Current protein motif discovery methods do not adequately consider such constraints to identify biologically significant motifs that are only weakly over-represented but spatially confined. Using negatives, i.e. sequences known to not contain a local motif, can further increase the specificity of their discovery.<\/jats:p>\n               <jats:p>Results: This article introduces the method DLocalMotif that makes use of positional information and negative data for local motif discovery in protein sequences. DLocalMotif combines three scoring functions, measuring degrees of motif over-representation, entropy and spatial confinement, specifically designed to discriminatively exploit the availability of negative data. The method is shown to outperform current methods that use only a subset of these motif characteristics. We apply the method to several biological datasets. The analysis of peroxisomal targeting signals uncovers several novel motifs that occur immediately upstream of the dominant peroxisomal targeting signal-1 signal. The analysis of proline-tyrosine nuclear localization signals uncovers multiple novel motifs that overlap with C2H2 zinc finger domains. 
We also evaluate the method on classical nuclear localization signals and endoplasmic reticulum retention signals and find that DLocalMotif successfully recovers biologically relevant sequence properties.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/bioinf.scmb.uq.edu.au\/dlocalmotif\/<\/jats:p>\n               <jats:p>Contact: \u00a0m.boden@uq.edu.au<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts654","type":"journal-article","created":{"date-parts":[[2012,11,10]],"date-time":"2012-11-10T04:30:01Z","timestamp":1352521801000},"page":"39-46","source":"Crossref","is-referenced-by-count":13,"title":["DLocalMotif: a discriminative approach for discovering local motifs in protein sequences"],"prefix":"10.1093","volume":"29","author":[{"given":"Ahmed M.","family":"Mehdi","sequence":"first","affiliation":[{"name":"1 Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"}]},{"given":"Muhammad Shoaib B.","family":"Sehgal","sequence":"additional","affiliation":[{"name":"1 Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"}]},{"given":"Bostjan","family":"Kobe","sequence":"additional","affiliation":[{"name":"1 Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"},{"name":"1 
Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"},{"name":"1 Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"}]},{"given":"Timothy L.","family":"Bailey","sequence":"additional","affiliation":[{"name":"1 Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"}]},{"given":"Mikael","family":"Bod\u00e9n","sequence":"additional","affiliation":[{"name":"1 Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"},{"name":"1 Institute for Molecular Bioscience, The University of Queensland, Australia, 2Microsoft corporation, USA, 3School of Chemistry and Molecular Biosciences, The University of Queensland, Australia and 4Infectious Diseases Research Centre, The University of Queensland, Australia"}]}],"member":"286","published-online":{"date-parts":[[2012,11,9]]},"reference":[{"key":"2023020303304371200_bts654-B1","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1186\/1471-2164-8-191","article-title":"C-terminal motif prediction in eukaryotic proteomes using comparative genomics and statistical over-representation across protein 
families","volume":"8","author":"Austin","year":"2007","journal-title":"BMC Genomics"},{"key":"2023020303304371200_bts654-B2","doi-asserted-by":"crossref","first-page":"W202","DOI":"10.1093\/nar\/gkp335","article-title":"MEME suite: tools for motif discovery and searching","volume":"37","author":"Bailey","year":"2009","journal-title":"Nucleic Acids Res."},{"volume-title":"Statistics for Technology: a Course in Applied Statistics. 3rd edn. Chapman and Hall, London\/New York, 1983","year":"1989","author":"Chatfield","key":"2023020303304371200_bts654-B3"},{"key":"2023020303304371200_bts654-B4","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1101\/gr.849004","article-title":"Weblogo: a sequence logo generator","volume":"14","author":"Crooks","year":"2004","journal-title":"Genome Res."},{"key":"2023020303304371200_bts654-B5","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1016\/0968-0004(91)90184-W","article-title":"Nuclear targeting sequences\u2013a consensus?","volume":"16","author":"Dingwall","year":"1991","journal-title":"Trends Biochem. Sci."},{"key":"2023020303304371200_bts654-B6","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1186\/1471-2105-9-19","article-title":"NestedMICA as an ab initio protein motif discovery tool","volume":"9","author":"Dogruel","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023020303304371200_bts654-B7","doi-asserted-by":"crossref","first-page":"1043","DOI":"10.1091\/mbc.7.7.1043","article-title":"Genes that control the fidelity of endoplasmic reticulum to golgi transport identified as suppressors of vesicle budding mutations","volume":"7","author":"Elrod-Erickson","year":"1996","journal-title":"Mol. Biol. 
Cell."},{"key":"2023020303304371200_bts654-B8","first-page":"1429","article-title":"Early stage monitoring of miltefosine induced apoptosis in KB cells by multinuclear NMR spectroscopy","volume":"16","author":"Engelmann","year":"1996","journal-title":"Anticancer Res."},{"key":"2023020303304371200_bts654-B9","doi-asserted-by":"crossref","first-page":"1249","DOI":"10.1128\/JB.01267-09","article-title":"The apparent malate synthase activity of rhodobacter sphaeroides is due to two paralogous enzymes, (3s)-malyl-coenzyme a (coa)\/beta-methylmalyl-coa lyase and (3s)- malyl-coa thioesterase","volume":"192","author":"Erb","year":"2010","journal-title":"J. Bacteriol."},{"key":"2023020303304371200_bts654-B10","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1038\/nmeth1061","article-title":"Trawler: de novo regulatory motif discovery pipeline for chromatin immunoprecipitation","volume":"4","author":"Ettwiller","year":"2007","journal-title":"Nat. Methods"},{"key":"2023020303304371200_bts654-B11","doi-asserted-by":"crossref","first-page":"R15.1","DOI":"10.1186\/gb-2008-9-1-r15","article-title":"Towards defining the nuclear proteome","volume":"9","author":"Fink","year":"2008","journal-title":"Genome Biol."},{"key":"2023020303304371200_bts654-B12","doi-asserted-by":"crossref","first-page":"D211","DOI":"10.1093\/nar\/gkp985","article-title":"The Pfam protein families database","volume":"38","author":"Finn","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023020303304371200_bts654-B13","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1186\/1471-2105-5-127","article-title":"Functionally specified protein signatures distinctive for each of the different blue copper proteins","volume":"5","author":"Giri","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023020303304371200_bts654-B14","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1038\/nature02800","article-title":"Transcriptional regulatory code of a eukaryotic 
genome","volume":"431","author":"Harbison","year":"2004","journal-title":"Nature"},{"key":"2023020303304371200_bts654-B15","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1002\/prot.21420","article-title":"Identifying novel peroxisomal proteins","volume":"69","author":"Hawkins","year":"2007","journal-title":"Proteins"},{"key":"2023020303304371200_bts654-B16","doi-asserted-by":"crossref","first-page":"680","DOI":"10.1093\/bioinformatics\/btq003","article-title":"CD-HIT suite: a web server for clustering and comparing biological sequences","volume":"26","author":"Huang","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020303304371200_bts654-B17","doi-asserted-by":"crossref","first-page":"e1001070","DOI":"10.1371\/journal.pcbi.1001070","article-title":"De-novo discovery of differentially abundant transcription factor binding sites including their positional preference","volume":"7","author":"Keilwagen","year":"2011","journal-title":"PLoS Comput. Biol."},{"key":"2023020303304371200_bts654-B18","doi-asserted-by":"crossref","first-page":"478","DOI":"10.1074\/jbc.M807017200","article-title":"Six classes of nuclear localization signals specific to different binding grooves of importin \u03b1","volume":"284","author":"Kosugi","year":"2009","journal-title":"J. Biol. 
Chem."},{"key":"2023020303304371200_bts654-B19","doi-asserted-by":"crossref","first-page":"543","DOI":"10.1016\/j.cell.2006.05.049","article-title":"Rules for nuclear localization sequence recognition by karyopherin beta 2","volume":"126","author":"Lee","year":"2006","journal-title":"Cell"},{"key":"2023020303304371200_bts654-B20","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1089\/104454900314492","article-title":"Characterization of a zinc finger protein ZAN75: nuclear localization signal, transcriptional activator activity, and expression during neuronal differentiation of P19 cells","volume":"19","author":"Lee","year":"2000","journal-title":"DNA Cell Biol."},{"key":"2023020303304371200_bts654-B21","doi-asserted-by":"crossref","first-page":"1180","DOI":"10.1101\/gr.076117.108","article-title":"Transcription factor and microRNA motif discovery: the Amadeus platform and a compendium of metazoan target sets","volume":"18","author":"Linhart","year":"2008","journal-title":"Genome Res."},{"key":"2023020303304371200_bts654-B22","doi-asserted-by":"crossref","first-page":"16337","DOI":"10.1074\/jbc.M001266200","article-title":"The sorting signals for peroxisomal membrane-bound ascorbate peroxidase are within its C-terminal tail","volume":"275","author":"Mullen","year":"2000","journal-title":"J. Biol. 
Chem."},{"key":"2023020303304371200_bts654-B23","doi-asserted-by":"crossref","first-page":"899","DOI":"10.1016\/0092-8674(87)90086-9","article-title":"A c-terminal signal prevents secretion of luminal er proteins","volume":"48","author":"Munro","year":"1987","journal-title":"Cell"},{"key":"2023020303304371200_bts654-B24","doi-asserted-by":"crossref","first-page":"1152","DOI":"10.1093\/bioinformatics\/btq106","article-title":"Localized motif discovery in gene regulatory sequences","volume":"26","author":"Narang","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020303304371200_bts654-B25","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1016\/S0022-2836(03)00318-8","article-title":"Motif refinement of the peroxisomal targeting signal 1 and evaluation of taxon-specific differences","volume":"328","author":"Neuberger","year":"2003","journal-title":"J. Mol. Biol."},{"key":"2023020303304371200_bts654-B26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2002-3-12-research0087","article-title":"Computational analysis of core promoters in the Drosophila genome","volume":"3","author":"Ohler","year":"2002","journal-title":"Genome Biol."},{"key":"2023020303304371200_bts654-B27","doi-asserted-by":"crossref","first-page":"W199","DOI":"10.1093\/nar\/gkh465","article-title":"Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes","volume":"32","author":"Pavesi","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023020303304371200_bts654-B28","doi-asserted-by":"crossref","first-page":"20285","DOI":"10.1074\/jbc.M109.004960","article-title":"An endoplasmic reticulum retention signal located in the extracellular amino-terminal domain of the NR2A subunit of N-Methyl-D-aspartate receptors","volume":"284","author":"Qiu","year":"2009","journal-title":"J. Biol. 
Chem."},{"key":"2023020303304371200_bts654-B29","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1186\/1471-2105-8-385","article-title":"Discriminative motif discovery in DNA and protein sequences using the DEME algorithm","volume":"8","author":"Redhead","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023020303304371200_bts654-B30","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.gene.2005.09.033","article-title":"Identification of highly specific localized sequence motifs in human ribosomal protein gene promoters","volume":"365","author":"Roepcke","year":"2006","journal-title":"Gene"},{"key":"2023020303304371200_bts654-B31","doi-asserted-by":"crossref","first-page":"22084","DOI":"10.1016\/S0021-9258(20)80651-6","article-title":"Intracellular retention of interleukin-6 abrogates signaling","volume":"268","author":"Rose-John","year":"1993","journal-title":"J. Biol. Chem."},{"key":"2023020303304371200_bts654-B32","doi-asserted-by":"crossref","first-page":"32327","DOI":"10.1074\/jbc.M706793200","article-title":"Nucleocytoplasmic shuttling of the zinc finger protein EZI is mediated by importin-7-dependent nuclear import and CRM1-independent export mechanisms","volume":"282","author":"Saijou","year":"2007","journal-title":"J. Biol. 
Chem"},{"key":"2023020303304371200_bts654-B33","doi-asserted-by":"crossref","first-page":"D161","DOI":"10.1093\/nar\/gkp885","article-title":"PROSITE, a protein domain database for functional characterization and annotation","volume":"38","author":"Sigrist","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023020303304371200_bts654-B34","doi-asserted-by":"crossref","first-page":"D535","DOI":"10.1093\/nar\/gkj109","article-title":"BioGRID: a general repository for interaction datasets","volume":"34","author":"Stark","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023020303304371200_bts654-B35","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1089\/10665270252935566","article-title":"A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes","volume":"9","author":"Thijs","year":"2002","journal-title":"J. Comput. Biol."},{"key":"2023020303304371200_bts654-B36","doi-asserted-by":"crossref","first-page":"3203","DOI":"10.1093\/nar\/gkm201","article-title":"Position and distance specificity are important determinants of cis-regulatory motifs in addition to evolutionary conservation","volume":"35","author":"Vardhanabhuti","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023020303304371200_bts654-B37","first-page":"60","article-title":"The large-sample distribution of the likelihood ratio for testing composite hypotheses","volume":"1","author":"Wilks","year":"1938","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020303304371200_bts654-B38","doi-asserted-by":"crossref","first-page":"7145","DOI":"10.1073\/pnas.0701811104","article-title":"Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites","volume":"104","author":"Xie","year":"2007","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023020303304371200_bts654-B39","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1111\/j.1365-2443.2005.00850.x","article-title":"Zinc finger domain of Snail functions as a nuclear localization signal for importin \u03b2-mediated nuclear import pathway","volume":"10","author":"Yamasaki","year":"2005","journal-title":"Genes Cells"},{"key":"2023020303304371200_bts654-B40","doi-asserted-by":"crossref","first-page":"2054","DOI":"10.1093\/bioinformatics\/btr353","article-title":"A tree-based approach for motif discovery and sequence classification","volume":"27","author":"Yan","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020303304371200_bts654-B41","doi-asserted-by":"crossref","first-page":"15412","DOI":"10.1074\/jbc.270.25.15412","article-title":"Addition of an endoplasmic reticulum retention\/retrieval signal does not block maturation of enzymatically active peptidylglycine alpha-amidating monooxygenase","volume":"270","author":"Yun","year":"1995","journal-title":"J. Biol. 
Chem."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/1\/39\/49060222\/bioinformatics_29_1_39.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/1\/39\/49060222\/bioinformatics_29_1_39.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,3]],"date-time":"2023-02-03T03:31:23Z","timestamp":1675395083000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/29\/1\/39\/273361"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,11,9]]},"references-count":41,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts654","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2013,1]]},"published":{"date-parts":[[2012,11,9]]}}}