{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,1]],"date-time":"2026-01-01T14:12:33Z","timestamp":1767276753577},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"21","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The position-weight matrix (PWM) is a useful representation of a transcription factor binding site (TFBS) sequence pattern because the PWM can be estimated from a small number of representative TFBS sequences. However, because the PWM probability model assumes independence between individual nucleotide positions, the PWMs for some TFs poorly discriminate binding sites from non-binding-sites that have similar sequence content. Since the local three-dimensional DNA structure (\u2018shape\u2019) is a determinant of TF binding specificity and since DNA shape has a significant sequence-dependence, we combined DNA shape-derived features into a TF-generalized regulatory score and tested whether the score could improve PWM-based discrimination of TFBS from non-binding-sites.<\/jats:p>\n               <jats:p>Results: We compared a traditional PWM model to a model that combines the PWM with a DNA shape feature-based regulatory potential score, for accuracy in detecting binding sites for 75 vertebrate transcription factors. The PWM + shape model was more accurate than the PWM-only model, for 45% of TFs tested, with no significant loss of accuracy for the remaining TFs.<\/jats:p>\n               <jats:p>Availability and implementation: The shape-based model is available as an open-source R package at that is archived on the GitHub software repository at https:\/\/github.com\/ramseylab\/regshape\/.<\/jats:p>\n               <jats:p>Contact: \u00a0stephen.ramsey@oregonstate.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv391","type":"journal-article","created":{"date-parts":[[2015,7,1]],"date-time":"2015-07-01T01:55:14Z","timestamp":1435715714000},"page":"3445-3450","source":"Crossref","is-referenced-by-count":19,"title":["A DNA shape-based regulatory score improves position-weight matrix-based recognition of transcription factor binding sites"],"prefix":"10.1093","volume":"31","author":[{"given":"Jichen","family":"Yang","sequence":"first","affiliation":[{"name":"1 Department of Biomedical Sciences and"}]},{"given":"Stephen A.","family":"Ramsey","sequence":"additional","affiliation":[{"name":"1 Department of Biomedical Sciences and"},{"name":"2 School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA"}]}],"member":"286","published-online":{"date-parts":[[2015,6,30]]},"reference":[{"key":"2023020202330026600_btv391-B1","doi-asserted-by":"crossref","first-page":"1429","DOI":"10.1038\/nbt1246","article-title":"Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities","volume":"24","author":"Berger","year":"2006","journal-title":"Nat. Biotechnol."},{"key":"2023020202330026600_btv391-B2","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Machine Learn."},{"key":"2023020202330026600_btv391-B3","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1093\/nar\/30.5.1255","article-title":"Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors","volume":"30","author":"Bulyk","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023020202330026600_btv391-B4","doi-asserted-by":"crossref","first-page":"2933","DOI":"10.1093\/bioinformatics\/bti473","article-title":"MatInspector and beyond: promoter analysis based on transcription factor binding sites","volume":"21","author":"Cartharius","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020202330026600_btv391-B5","doi-asserted-by":"crossref","first-page":"e63","DOI":"10.1371\/journal.pcbi.0030063","article-title":"Integration of genome and chromatin structure with gene expression profiles to predict c-MYC recognition site binding and function","volume":"3","author":"Chen","year":"2007","journal-title":"PLoS Comput. Biol."},{"key":"2023020202330026600_btv391-B6","first-page":"431","article-title":"The statistical significance of nucleotide position-weight matrix matches","volume":"12","author":"Claverie","year":"1996","journal-title":"Comput. Appl. Biosci."},{"key":"2023020202330026600_btv391-B7","doi-asserted-by":"crossref","first-page":"D91","DOI":"10.1093\/nar\/gkp781","article-title":"3D-footprint: a database for the structural analysis of protein\u2013DNA complexes","volume":"38","author":"Contreras-Moreira","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023020202330026600_btv391-B8","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1093\/bioinformatics\/btr614","article-title":"Epigenetic priors for identifying active transcription factor binding sites","volume":"28","author":"Cuellar-Partida","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020202330026600_btv391-B9","doi-asserted-by":"crossref","first-page":"i101","DOI":"10.1093\/bioinformatics\/bth927","article-title":"Predicting gene regulation by sigma factors in Bacillus subtilis from genome-wide data","volume":"20","author":"de Hoon","year":"2004","journal-title":"Bioinformatics"},{"key":"2023020202330026600_btv391-B10","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1101\/gr.817703","article-title":"Distinguishing regulatory DNA from neutral sites","volume":"13","author":"Elnitski","year":"2003","journal-title":"Genome Res."},{"key":"2023020202330026600_btv391-B11","doi-asserted-by":"crossref","first-page":"526","DOI":"10.1101\/gr.096305.109","article-title":"Integrating multiple evidence sources to predict transcription factor binding in the human genome","volume":"20","author":"Ernst","year":"2010","journal-title":"Genome Res."},{"key":"2023020202330026600_btv391-B12","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1214\/aos\/1016218223","article-title":"Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors)","volume":"28","author":"Friedman","year":"2000","journal-title":"Ann. Stat."},{"key":"2023020202330026600_btv391-B13","doi-asserted-by":"crossref","first-page":"1393","DOI":"10.1074\/jbc.M109.063032","article-title":"Genomic targets of the KRAB and SCAN domain-containing zinc finger protein 263","volume":"285","author":"Frietze","year":"2010","journal-title":"J. Biol. Chem."},{"key":"2023020202330026600_btv391-B14","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1038\/nature11245","article-title":"Architecture of the human regulatory network derived from ENCODE data","volume":"489","author":"Gerstein","year":"2012","journal-title":"Nature"},{"key":"2023020202330026600_btv391-B15","first-page":"83","article-title":"Integrating genomic data to predict transcription factor binding","volume":"16","author":"Holloway","year":"2005","journal-title":"Genome Inf."},{"key":"2023020202330026600_btv391-B16","doi-asserted-by":"crossref","first-page":"e106","DOI":"10.1093\/nar\/gks283","article-title":"A flexible integrative approach based on random forest improves prediction of transcription factor binding sites","volume":"40","author":"Hooghe","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023020202330026600_btv391-B17","doi-asserted-by":"crossref","first-page":"1497","DOI":"10.1126\/science.1141319","article-title":"Genome-wide mapping of in\u00a0vivo protein\u2013DNA interactions","volume":"316","author":"Johnson","year":"2007","journal-title":"Science"},{"key":"2023020202330026600_btv391-B18","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1080\/07391102.1988.10506483","article-title":"The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids","volume":"6","author":"Lavery","year":"1988","journal-title":"J. Biomol. Struct. Dyn."},{"key":"2023020202330026600_btv391-B19","doi-asserted-by":"crossref","first-page":"e1820","DOI":"10.1371\/journal.pone.0001820","article-title":"Probabilistic inference of transcription factor binding from multiple data sources","volume":"3","author":"L\u00e4hdesm\u00e4ki","year":"2008","journal-title":"PLoS ONE"},{"key":"2023020202330026600_btv391-B20","doi-asserted-by":"crossref","first-page":"669","DOI":"10.1038\/nrg2641","article-title":"ChIP-seq: advantages and challenges of a maturing technology","volume":"10","author":"Park","year":"2009","journal-title":"Nat. Rev. Genet."},{"key":"2023020202330026600_btv391-B21","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1101\/gr.112623.110","article-title":"Accurate inference of transcription factor binding from DNA sequence and chromatin data","volume":"21","author":"Pique-Regi","year":"2011","journal-title":"Genome Res."},{"key":"2023020202330026600_btv391-B22","doi-asserted-by":"crossref","first-page":"2071","DOI":"10.1093\/bioinformatics\/btq405","article-title":"Genome-wide histone acetylation data improve prediction of mammalian transcription factor binding sites","volume":"26","author":"Ramsey","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020202330026600_btv391-B23","doi-asserted-by":"crossref","first-page":"1248","DOI":"10.1038\/nature08473","article-title":"The role of DNA shape in protein\u2013DNA recognition","volume":"461","author":"Rohs","year":"2009","journal-title":"Nature"},{"key":"2023020202330026600_btv391-B24","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1038\/nbt1098-939","article-title":"Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation","volume":"16","author":"Roth","year":"1998","journal-title":"Nat. Biotechnol."},{"key":"2023020202330026600_btv391-B25","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1038\/nbt.1518","article-title":"PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls","volume":"27","author":"Rozowsky","year":"2009","journal-title":"Nat. Biotechnol."},{"key":"2023020202330026600_btv391-B26","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1038\/nature04979","article-title":"A genomic code for nucleosome positioning","volume":"442","author":"Segal","year":"2006","journal-title":"Nature"},{"key":"2023020202330026600_btv391-B27","doi-asserted-by":"crossref","first-page":"W555","DOI":"10.1093\/nar\/gkl224","article-title":"Stubb: a program for discovery and analysis of cis-regulatory modules","volume":"34","author":"Sinha","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023020202330026600_btv391-B28","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1093\/nar\/12.1Part2.505","article-title":"Computer methods to locate signals in nucleic acid sequences","volume":"12","author":"Staden","year":"1984","journal-title":"Nucleic Acids Res."},{"key":"2023020202330026600_btv391-B29","doi-asserted-by":"crossref","first-page":"2997","DOI":"10.1093\/nar\/10.9.2997","article-title":"Use of the \u2018Perceptron\u2019 algorithm to distinguish translational initiation sites in E","volume":"10","author":"Stormo","year":"1982","journal-title":"coli. Nucleic Acids Res."},{"key":"2023020202330026600_btv391-B30","doi-asserted-by":"crossref","first-page":"2508","DOI":"10.1002\/j.1460-2075.1996.tb00608.x","article-title":"Acetylation of histone H4 plays a primary role in enhancing transcription factor binding to nucleosomal DNA in\u00a0vitro","volume":"15","author":"Vettese-Dadey","year":"1996","journal-title":"EMBO J."},{"key":"2023020202330026600_btv391-B31","first-page":"281","article-title":"Support vector method for function approximation, regression estimation, and signal processing","volume":"9","author":"Vapnik","year":"1996","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"2023020202330026600_btv391-B32","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1038\/nrg1315","article-title":"Applied bioinformatics for the identification of regulatory elements","volume":"5","author":"Wasserman","year":"2004","journal-title":"Nat. Rev. Genet."},{"key":"2023020202330026600_btv391-B33","doi-asserted-by":"crossref","first-page":"R7","DOI":"10.1186\/gb-2010-11-1-r7","article-title":"Genome-wide prediction of transcription factor binding sites using an integrated model","volume":"11","author":"Won","year":"2010","journal-title":"Genome Biol."},{"key":"2023020202330026600_btv391-B34","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1101\/gr.072769.107","article-title":"Cross-species de novo identification of cis-regulatory modules with GibbsModule: application to gene regulation in embryonic stem cells","volume":"18","author":"Xie","year":"2008","journal-title":"Genome Res."},{"key":"2023020202330026600_btv391-B35","doi-asserted-by":"crossref","first-page":"D148","DOI":"10.1093\/nar\/gkt1087","article-title":"TFBSshape: a motif database for DNA shape features of transcription factor binding sites","volume":"42","author":"Yang","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023020202330026600_btv391-B36","doi-asserted-by":"crossref","first-page":"W56","DOI":"10.1093\/nar\/gkt437","article-title":"DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale","volume":"41","author":"Zhou","year":"2013","journal-title":"Nucleic Acids Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/21\/3445\/49035979\/bioinformatics_31_21_3445.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/21\/3445\/49035979\/bioinformatics_31_21_3445.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T03:52:54Z","timestamp":1675309974000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/21\/3445\/194637"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,6,30]]},"references-count":36,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2015,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv391","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,11,1]]},"published":{"date-parts":[[2015,6,30]]}}}