{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T23:12:51Z","timestamp":1772493171546,"version":"3.50.1"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2019,3,23]],"date-time":"2019-03-23T00:00:00Z","timestamp":1553299200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"NIH BRAIN Initiative","award":["U01NS094330"],"award-info":[{"award-number":["U01NS094330"]}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>We set out to develop an algorithm that can mine differential gene expression data to identify candidate cell type-specific DNA regulatory sequences. Differential expression is usually quantified as a continuous score\u2014fold-change, test-statistic, P-value\u2014comparing biological classes. Unlike existing approaches, our de novo strategy, termed SArKS, applies non-parametric kernel smoothing to uncover promoter motif sites that correlate with elevated differential expression scores. SArKS detects motif k-mers by smoothing sequence scores over sequence similarity. A second round of smoothing over spatial proximity reveals multi-motif domains (MMDs). Discovered motif sites can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We applied SArKS to published gene expression data representing distinct neocortical neuron classes in Mus musculus and interneuron developmental states in Homo sapiens. When benchmarked against several existing algorithms using a cross-validation procedure, SArKS identified larger motif sets that formed the basis for regression models with higher correlative power.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>https:\/\/github.com\/denniscwylie\/sarks.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz198","type":"journal-article","created":{"date-parts":[[2019,3,20]],"date-time":"2019-03-20T16:23:04Z","timestamp":1553098984000},"page":"3944-3952","source":"Crossref","is-referenced-by-count":2,"title":["SArKS:\n                    <i>de novo<\/i>\n                    discovery of gene expression regulatory motif sites and domains by suffix array kernel smoothing"],"prefix":"10.1093","volume":"35","author":[{"given":"Dennis C","family":"Wylie","sequence":"first","affiliation":[{"name":"Center for Computational Biology and Bioinformatics, University of Texas at Austin , Austin, TX, USA"}]},{"given":"Hans A","family":"Hofmann","sequence":"additional","affiliation":[{"name":"Center for Computational Biology and Bioinformatics, University of Texas at Austin , Austin, TX, USA"},{"name":"Institute for Cellular and Molecular Biology, University of Texas at Austin , Austin, TX, USA"},{"name":"Department of Integrative Biology, University of Texas at Austin , Austin, TX, USA"},{"name":"Institute for Neuroscience, University of Texas at Austin , Austin, TX, USA"}]},{"given":"Boris V","family":"Zemelman","sequence":"additional","affiliation":[{"name":"Institute for Cellular and Molecular Biology, University of Texas at Austin , Austin, TX, USA"},{"name":"Institute for Neuroscience, University of Texas at Austin , Austin, TX, USA"},{"name":"Department of Neuroscience, University of Texas at Austin , Austin, TX, USA"},{"name":"Center for Learning and Memory, University of Texas at Austin , Austin, TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2019,3,23]]},"reference":[{"key":"2023013108275592900_btz198-B1","doi-asserted-by":"crossref","first-page":"1720","DOI":"10.1126\/science.1162327","article-title":"Diversity and complexity in DNA recognition by transcription factors","volume":"324","author":"Badis","year":"2009","journal-title":"Science"},{"key":"2023013108275592900_btz198-B2","doi-asserted-by":"crossref","first-page":"1653","DOI":"10.1093\/bioinformatics\/btr261","article-title":"DREME: motif discovery in transcription factor ChIP-seq data","volume":"27","author":"Bailey","year":"2011","journal-title":"Bioinformatics"},{"key":"2023013108275592900_btz198-B3","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1007\/BF00993379","article-title":"Unsupervised learning of multiple motifs in biopolymers using expectation maximization","volume":"21","author":"Bailey","year":"1995","journal-title":"Mach. Learn"},{"key":"2023013108275592900_btz198-B4","doi-asserted-by":"crossref","first-page":"W369","DOI":"10.1093\/nar\/gkl198","article-title":"MEME: discovering and analyzing DNA and protein sequence motifs","volume":"34","author":"Bailey","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023013108275592900_btz198-B5","doi-asserted-by":"crossref","first-page":"W202","DOI":"10.1093\/nar\/gkp335","article-title":"MEME SUITE: tools for motif discovery and searching","volume":"37","author":"Bailey","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023013108275592900_btz198-B6","doi-asserted-by":"crossref","first-page":"1035","DOI":"10.1016\/j.neuron.2017.02.014","article-title":"Single-cell profiling of an in vitro model of human interneuron development reveals temporal dynamics of cell type production and maturation","volume":"93","author":"Close","year":"2017","journal-title":"Neuron"},{"key":"2023013108275592900_btz198-B7","doi-asserted-by":"crossref","first-page":"75.","DOI":"10.1038\/378075a0","article-title":"Synchronization of neuronal activity in hippocampus by individual GABAergic interneurons","volume":"378","author":"Cobb","year":"1995","journal-title":"Nature"},{"key":"2023013108275592900_btz198-B8","doi-asserted-by":"crossref","first-page":"3339","DOI":"10.1073\/pnas.0630591100","article-title":"Integrating regulatory motif discovery and genome-wide expression analysis","volume":"100","author":"Conlon","year":"2003","journal-title":"Proc. Natl. Acad. Sci"},{"key":"2023013108275592900_btz198-B9","doi-asserted-by":"crossref","first-page":"11.","DOI":"10.1186\/1471-2105-9-11","article-title":"SeqAn an efficient, generic C++ library for sequence analysis","volume":"9","author":"D\u00f6ring","year":"2008","journal-title":"BMC Bioinf"},{"key":"2023013108275592900_btz198-B10","doi-asserted-by":"crossref","first-page":"aac7247.","DOI":"10.1126\/science.aac7247","article-title":"Retrotransposons as regulators of gene expression","volume":"351","author":"Elbarbary","year":"2016","journal-title":"Science"},{"key":"2023013108275592900_btz198-B11","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.molcel.2007.09.027","article-title":"A universal framework for regulatory element discovery across all genomes and data types","volume":"28","author":"Elemento","year":"2007","journal-title":"Mol. Cell"},{"key":"2023013108275592900_btz198-B12","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/nrg1348","article-title":"Microsatellites: simple sequences with complex evolution","volume":"5","author":"Ellegren","year":"2004","journal-title":"Nat. Rev. Genet"},{"key":"2023013108275592900_btz198-B13","doi-asserted-by":"crossref","first-page":"676","DOI":"10.1214\/088342304000000396","article-title":"Permutation methods: a basis for exact inference","volume":"19","author":"Ernst","year":"2004","journal-title":"Stat. Sci"},{"key":"2023013108275592900_btz198-B14","doi-asserted-by":"crossref","first-page":"2303","DOI":"10.1093\/bioinformatics\/btn444","article-title":"Seeder: discriminative seeding DNA motif discovery","volume":"24","author":"Fauteux","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013108275592900_btz198-B15","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1002\/pst.331","article-title":"Consequences of dichotomization","volume":"8","author":"Fedorov","year":"2009","journal-title":"Pharm. Stat"},{"key":"2023013108275592900_btz198-B16","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1080\/00401706.1979.10489751","article-title":"Generalized cross-validation as a method for choosing a good ridge parameter","volume":"21","author":"Golub","year":"1979","journal-title":"Technometrics"},{"key":"2023013108275592900_btz198-B17","doi-asserted-by":"crossref","first-page":"565","DOI":"10.1101\/gr.104471.109","article-title":"Homotypic clusters of transcription factor binding sites are a key component of human promoters and enhancers","volume":"20","author":"Gotea","year":"2010","journal-title":"Genome Res"},{"key":"2023013108275592900_btz198-B18","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1093\/bioinformatics\/btr064","article-title":"FIMO: scanning for occurrences of a given motif","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"},{"key":"2023013108275592900_btz198-B19","doi-asserted-by":"crossref","first-page":"1.","DOI":"10.1186\/gb-2007-8-2-r24","article-title":"Quantifying similarity between motifs","volume":"8","author":"Gupta","year":"2007","journal-title":"Genome Biol"},{"key":"2023013108275592900_btz198-B20","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1016\/j.molcel.2010.05.004","article-title":"Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities","volume":"38","author":"Heinz","year":"2010","journal-title":"Mol. Cell"},{"key":"2023013108275592900_btz198-B21","doi-asserted-by":"crossref","first-page":"2361","DOI":"10.1093\/bioinformatics\/btr412","article-title":"DECOD: fast and accurate discriminative DNA motif finding","volume":"27","author":"Huggins","year":"2011","journal-title":"Bioinformatics"},{"key":"2023013108275592900_btz198-B22","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1007\/3-540-45061-0_73","volume-title":"International Colloquium on Automata, Languages, and Programming","author":"K\u00e4rkk\u00e4inen","year":"2003"},{"key":"2023013108275592900_btz198-B23","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1126\/science.1149381","article-title":"Neuronal diversity and temporal dynamics: the unity of hippocampal circuit operations","volume":"321","author":"Klausberger","year":"2008","journal-title":"Science"},{"key":"2023013108275592900_btz198-B24","doi-asserted-by":"crossref","first-page":"312.","DOI":"10.1038\/nrn1648","article-title":"Cortical inhibitory neurons and schizophrenia","volume":"6","author":"Lewis","year":"2005","journal-title":"Nat. Rev. Neurosci"},{"key":"2023013108275592900_btz198-B25","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1038\/nbt717","article-title":"An algorithm for finding protein\u2013DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments","volume":"20","author":"Liu","year":"2002","journal-title":"Nat. Biotechnol"},{"key":"2023013108275592900_btz198-B26","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1016\/S0065-2660(07)00010-7","article-title":"Genomic identification of regulatory elements by evolutionary sequence comparison and functional analysis","volume":"61","author":"Loots","year":"2008","journal-title":"Adv. Genet"},{"key":"2023013108275592900_btz198-B27","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1089\/106652700750050826","article-title":"Algorithms for extracting structured motifs using a suffix tree with an application to promoter and regulatory site consensus identification","volume":"7","author":"Marsan","year":"2000","journal-title":"J. Comput. Biol"},{"key":"2023013108275592900_btz198-B28","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1146\/annurev.genom.7.080505.115623","article-title":"Transcriptional regulatory elements in the human genome","volume":"7","author":"Maston","year":"2006","journal-title":"Annu. Rev. Genomics Hum. Genet"},{"key":"2023013108275592900_btz198-B29","doi-asserted-by":"crossref","first-page":"D110","DOI":"10.1093\/nar\/gkv1176","article-title":"JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles","volume":"44","author":"Mathelier","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023013108275592900_btz198-B30","doi-asserted-by":"crossref","first-page":"1369","DOI":"10.1016\/j.neuron.2015.05.018","article-title":"Epigenomic signatures of neuronal diversity in the mammalian brain","volume":"86","author":"Mo","year":"2015","journal-title":"Neuron"},{"key":"2023013108275592900_btz198-B31","doi-asserted-by":"crossref","DOI":"10.1038\/msb4100054","article-title":"Deciphering principles of transcription regulation in eukaryotic genomes","volume":"2","author":"Nguyen","year":"2006","journal-title":"Mol. Systems Biol"},{"key":"2023013108275592900_btz198-B32","doi-asserted-by":"crossref","first-page":"S207","DOI":"10.1093\/bioinformatics\/17.suppl_1.S207","article-title":"An algorithm for finding signals of unknown length in DNA sequences","volume":"17","author":"Pavesi","year":"2001","journal-title":"Bioinformatics"},{"key":"2023013108275592900_btz198-B33","doi-asserted-by":"crossref","first-page":"W199","DOI":"10.1093\/nar\/gkh465","article-title":"Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes","volume":"32","author":"Pavesi","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023013108275592900_btz198-B34","doi-asserted-by":"crossref","first-page":"1.","DOI":"10.1186\/1471-2105-8-385","article-title":"Discriminative motif discovery in DNA and protein sequences using the DEME algorithm","volume":"8","author":"Redhead","year":"2007","journal-title":"BMC Bioinf"},{"key":"2023013108275592900_btz198-B35","doi-asserted-by":"crossref","first-page":"e126.","DOI":"10.1093\/nar\/gkr574","article-title":"STEME: efficient EM to find motifs in large data sets","volume":"39","author":"Reid","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023013108275592900_btz198-B36","doi-asserted-by":"crossref","first-page":"e90735.","DOI":"10.1371\/journal.pone.0090735","article-title":"STEME: a robust, accurate motif finder for large data sets","volume":"9","author":"Reid","year":"2014","journal-title":"PLoS One"},{"key":"2023013108275592900_btz198-B37","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1007\/BFb0054337","volume-title":"Latin American Symposium on Theoretical Informatics","author":"Sagot","year":"1998"},{"key":"2023013108275592900_btz198-B38","doi-asserted-by":"crossref","first-page":"822","DOI":"10.1089\/cmb.2005.12.822","article-title":"A discriminative model for identifying spatial cis-regulatory modules","volume":"12","author":"Segal","year":"2005","journal-title":"J. Comput. Biol"},{"key":"2023013108275592900_btz198-B39","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1145\/565196.565231","article-title":"From promoter sequence to expression: a probabilistic framework","volume-title":"Proceedings of the Sixth Annual International Conference on Computational Biology","author":"Segal","year":"2002"},{"key":"2023013108275592900_btz198-B40","doi-asserted-by":"crossref","first-page":"599","DOI":"10.1089\/10665270360688219","article-title":"Discriminative motifs","volume":"10","author":"Sinha","year":"2003","journal-title":"J. Comput. Biol"},{"key":"2023013108275592900_btz198-B41","doi-asserted-by":"crossref","first-page":"973","DOI":"10.1534\/genetics.112.143370","article-title":"Why transcription factor binding sites are ten nucleotides long","volume":"192","author":"Stewart","year":"2012","journal-title":"Genetics"},{"key":"2023013108275592900_btz198-B42","doi-asserted-by":"crossref","first-page":"e1000562.","DOI":"10.1371\/journal.pcbi.1000562","article-title":"Discovery of regulatory elements is improved by a discriminatory approach","volume":"5","author":"Valen","year":"2009","journal-title":"PLoS Comput. Biol"},{"key":"2023013108275592900_btz198-B43","doi-asserted-by":"crossref","first-page":"1445","DOI":"10.1101\/gr.5321506","article-title":"Unraveling transcription regulatory networks by protein\u2013DNA and protein\u2013protein interaction mapping","volume":"16","author":"Walhout","year":"2006","journal-title":"Genome Res"},{"key":"2023013108275592900_btz198-B44","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1038\/nrg1315","article-title":"Applied bioinformatics for the identification of regulatory elements","volume":"5","author":"Wasserman","year":"2004","journal-title":"Nat. Rev. Genet"},{"key":"2023013108275592900_btz198-B45","doi-asserted-by":"crossref","first-page":"775","DOI":"10.1093\/bioinformatics\/btt615","article-title":"Discriminative motif analysis of high-throughput dataset","volume":"30","author":"Yao","year":"2014","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz198\/28554768\/btz198.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/20\/3944\/48976331\/bioinformatics_35_20_3944.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/20\/3944\/48976331\/bioinformatics_35_20_3944.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T12:17:55Z","timestamp":1675167475000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/20\/3944\/5418797"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,3,23]]},"references-count":45,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2019,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz198","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/133934","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,10,15]]},"published":{"date-parts":[[2019,3,23]]}}}