{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T19:09:27Z","timestamp":1774552167039,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Cys2His2 zinc finger (ZF) proteins represent the largest class of eukaryotic transcription factors. Their modular structure and well-conserved protein-DNA interface allow the development of computational approaches for predicting their DNA-binding preferences even when no binding sites are known for a particular protein. The \u2018canonical model\u2019 for ZF protein-DNA interaction consists of only four amino acid nucleotide contacts per zinc finger domain.<\/jats:p>\n               <jats:p>Results: We present an approach for predicting ZF binding based on support vector machines (SVMs). While most previous computational approaches have been based solely on examples of known ZF protein\u2013DNA interactions, ours additionally incorporates information about protein\u2013DNA pairs known to bind weakly or not at all. Moreover, SVMs with a linear kernel can naturally incorporate constraints about the relative binding affinities of protein-DNA pairs; this type of information has not been used previously in predicting ZF protein-DNA binding. Here, we build a high-quality literature-derived experimental database of ZF\u2013DNA binding examples and utilize it to test both linear and polynomial kernels for predicting ZF protein\u2013DNA binding on the basis of the canonical binding model. The polynomial SVM outperforms previously published prediction procedures as well as the linear SVM. 
This may indicate the presence of dependencies between contacts in the canonical binding model and suggests that modification of the underlying structural model may result in further improved performance in predicting ZF protein\u2013DNA binding. Overall, this work demonstrates that methods incorporating information about non-binding and relative binding of protein\u2013DNA pairs have great potential for effective prediction of protein\u2013DNA interactions.<\/jats:p>\n               <jats:p>Availability: An online tool for predicting ZF DNA binding is available at http:\/\/compbio.cs.princeton.edu\/zf\/.<\/jats:p>\n               <jats:p>Contact: \u00a0mona@cs.princeton.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn580","type":"journal-article","created":{"date-parts":[[2008,11,14]],"date-time":"2008-11-14T01:25:06Z","timestamp":1226625906000},"page":"22-29","source":"Crossref","is-referenced-by-count":117,"title":["Predicting DNA recognition by Cys2His2 zinc finger proteins"],"prefix":"10.1093","volume":"25","author":[{"given":"Anton V.","family":"Persikov","sequence":"first","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics and 2Department of Computer Science, Princeton University, Princeton, NJ 08544, USA"}]},{"given":"Robert","family":"Osada","sequence":"additional","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics and 2Department of Computer Science, Princeton University, Princeton, NJ 08544, USA"}]},{"given":"Mona","family":"Singh","sequence":"additional","affiliation":[{"name":"1 Lewis-Sigler Institute for Integrative Genomics and 2Department of Computer Science, Princeton University, Princeton, NJ 08544, USA"},{"name":"1 Lewis-Sigler Institute for Integrative Genomics and 2Department of Computer Science, Princeton University, Princeton, NJ 08544, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2008,11,13]]},"reference":[{"key":"2023013110024458000_B1","first-page":"115","article-title":"SAMIE: statistical algorithm for modeling interaction energies","volume":"6","author":"Benos","year":"2001","journal-title":"Pac. Symp. Biocomput."},{"key":"2023013110024458000_B2","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1016\/S0022-2836(02)00917-8","article-title":"Probabilistic code for DNA recognition by proteins of the EGR family","volume":"323","author":"Benos","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023013110024458000_B3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B4","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1038\/nbt794","article-title":"Scanning the human genome with combinatorial transcription factor libraries","volume":"21","author":"Blancafort","year":"2003","journal-title":"Nat. Biotechnol."},{"key":"2023013110024458000_B5","doi-asserted-by":"crossref","first-page":"7158","DOI":"10.1073\/pnas.111163698","article-title":"Exploring the DNA-binding specificities of zinc fingers with DNA microarrays","volume":"98","author":"Bulyk","year":"2001","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023013110024458000_B6","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1093\/nar\/30.5.1255","article-title":"Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors","volume":"30","author":"Bulyk","year":"2002","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B7","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511801389","volume-title":"An Introduction to Support Vector Machines: And Other Kernel-based Learning Methods","author":"Cristianini","year":"2000"},{"key":"2023013110024458000_B8","doi-asserted-by":"crossref","first-page":"489","DOI":"10.1006\/jmbi.2000.4133","article-title":"Insights into the molecular recognition of the 5\u2032-GNN-3\u2032 family of DNA sequences by zinc finger domains","volume":"303","author":"Dreier","year":"2000","journal-title":"J. Mol. Biol."},{"key":"2023013110024458000_B9","doi-asserted-by":"crossref","first-page":"29466","DOI":"10.1074\/jbc.M102604200","article-title":"Development of zinc finger domains for recognition of the 5\u2032-ANN-3\u2032 family of DNA sequences and their use in the construction of artificial transcription factors","volume":"276","author":"Dreier","year":"2001","journal-title":"J. Biol. Chem."},{"key":"2023013110024458000_B10","doi-asserted-by":"crossref","first-page":"35588","DOI":"10.1074\/jbc.M506654200","article-title":"Development of zinc finger domains for recognition of the 5\u2032-CNN-3\u2032 family DNA sequences and their use in the construction of artificial transcription factors","volume":"280","author":"Dreier","year":"2005","journal-title":"J. Biol. 
Chem."},{"key":"2023013110024458000_B11","doi-asserted-by":"crossref","first-page":"1171","DOI":"10.1016\/S0969-2126(96)00125-6","article-title":"Zif268 protein-DNA complex refined at 1.6 \u00c5: a model system for understanding zinc finger\u2013DNA interactions","volume":"4","author":"Elrod-Erickson","year":"1996","journal-title":"Structure"},{"key":"2023013110024458000_B12","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1016\/S0969-2126(98)00047-1","article-title":"High-resolution structures of variant Zif268-DNA complexes: implications for understanding zinc finger-DNA recognition","volume":"6","author":"Elrod-Erickson","year":"1998","journal-title":"Structure"},{"key":"2023013110024458000_B13","doi-asserted-by":"crossref","first-page":"061921","DOI":"10.1103\/PhysRevE.73.061921","article-title":"Weight matrices for protein\u2013DNA binding sites from a single co-crystal structure","volume":"73","author":"Endres","year":"2006","journal-title":"Phys. Rev. E. Stat. Nonlin. Soft Matter Phys."},{"key":"2023013110024458000_B14","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recogn. 
Lett."},{"key":"2023013110024458000_B15","doi-asserted-by":"crossref","first-page":"R11","DOI":"10.1186\/gb-2004-5-2-r11","article-title":"Predicting specificity in bZIP coiled-coil protein interactions","volume":"5","author":"Fong","year":"2004","journal-title":"Genome Biol."},{"key":"2023013110024458000_B16","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1093\/bioinformatics\/btn198","article-title":"Eukaryotic transcription factor binding sites\u2014modeling and integrative search methods","volume":"24","author":"Hannenhalli","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013110024458000_B17","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1038\/nature02800","article-title":"Transcriptional regulatory code of a eukaryotic genome","volume":"431","author":"Harbison","year":"2004","journal-title":"Nature"},{"key":"2023013110024458000_B18","doi-asserted-by":"crossref","first-page":"625","DOI":"10.1007\/PL00000885","article-title":"Three classes of C2H2 zinc finger proteins","volume":"58","author":"Iuchi","year":"2001","journal-title":"Cell Mol. Life Sci."},{"key":"2023013110024458000_B19","first-page":"376","article-title":"Making large-scale SVM learning practical","volume-title":"Advances in Kernel Methods: Support Vector Learning","author":"Joachims","year":"1999"},{"key":"2023013110024458000_B20","doi-asserted-by":"crossref","first-page":"e1","DOI":"10.1371\/journal.pcbi.0010001","article-title":"Ab initio prediction of transcription factor targets using structural knowledge","volume":"1","author":"Kaplan","year":"2005","journal-title":"PLoS Comput. 
Biol."},{"key":"2023013110024458000_B21","doi-asserted-by":"crossref","first-page":"1850","DOI":"10.1093\/bioinformatics\/btn331","article-title":"Context-dependent DNA recognition code for C2H2 zinc-finger transcription factors","volume":"24","author":"Liu","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013110024458000_B22","doi-asserted-by":"crossref","first-page":"546","DOI":"10.1093\/nar\/gki204","article-title":"Quantitative evaluation of protein-DNA interactions using an optimized knowledge-based potential","volume":"33","author":"Liu","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B23","doi-asserted-by":"crossref","first-page":"2860","DOI":"10.1093\/nar\/29.13.2860","article-title":"Amino acid-base interactions: a three-dimensional analysis of protein\u2013DNA interactions at an atomic level","volume":"29","author":"Luscombe","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B24","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1016\/j.molcel.2008.06.016","article-title":"Rapid \u2018open-source\u2019 engineering of customized zinc-finger nucleases for highly efficient gene modification","volume":"31","author":"Maeder","year":"2008","journal-title":"Mol. 
Cell"},{"key":"2023013110024458000_B25","doi-asserted-by":"crossref","first-page":"2306","DOI":"10.1093\/nar\/26.10.2306","article-title":"Quantitative parameters for amino acid-base interaction: implications for prediction of protein\u2013DNA binding sites","volume":"26","author":"Mandel-Gutfreund","year":"1998","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B26","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1093\/nar\/gkg108","article-title":"TRANSFAC: transcriptional regulation, from patterns to profiles","volume":"31","author":"Matys","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B27","doi-asserted-by":"crossref","first-page":"5781","DOI":"10.1093\/nar\/gki875","article-title":"Protein-DNA binding specificity predictions with structural models","volume":"33","author":"Morozov","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B28","doi-asserted-by":"crossref","first-page":"1331","DOI":"10.1038\/ng1473","article-title":"Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays","volume":"36","author":"Mukherjee","year":"2004","journal-title":"Nat. Genet."},{"key":"2023013110024458000_B29","doi-asserted-by":"crossref","first-page":"2938","DOI":"10.1073\/pnas.95.6.2938","article-title":"Differing roles for zinc fingers in DNA recognition: structure of a six-finger transcription factor IIIA complex","volume":"95","author":"Nolte","year":"1998","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023013110024458000_B30","doi-asserted-by":"crossref","first-page":"e89","DOI":"10.1371\/journal.pgen.0030089","article-title":"Genome-wide analysis of KAP1 binding suggests autoregulation of KRAB-ZNFs","volume":"3","author":"O'Geen","year":"2007","journal-title":"PLoS Genet."},{"key":"2023013110024458000_B31","doi-asserted-by":"crossref","first-page":"3516","DOI":"10.1093\/bioinformatics\/bth438","article-title":"Comparative analysis of methods for representing and searching for transcription factor binding sites","volume":"20","author":"Osada","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013110024458000_B32","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1146\/annurev.biochem.70.1.313","article-title":"Design and selection of novel Cys2His2 zinc finger proteins","volume":"70","author":"Pabo","year":"2001","journal-title":"Annu. Rev. Biochem."},{"key":"2023013110024458000_B33","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1126\/science.2028256","article-title":"Zinc finger-DNA recognition: crystal structure of a Zif268\u2013DNA complex at 2.1 \u00c5","volume":"252","author":"Pavletich","year":"1991","journal-title":"Science"},{"key":"2023013110024458000_B34","doi-asserted-by":"crossref","first-page":"1701","DOI":"10.1126\/science.8378770","article-title":"Crystal structure of a five-finger GLI\u2013DNA complex: new perspectives on zinc fingers","volume":"261","author":"Pavletich","year":"1993","journal-title":"Science"},{"key":"2023013110024458000_B35","doi-asserted-by":"crossref","first-page":"2758","DOI":"10.1073\/pnas.96.6.2758","article-title":"Toward controlling gene expression at will: selection and design of zinc finger domains recognizing each of the 5\u2032-GNN-3\u2032 DNA target sequences","volume":"96","author":"Segal","year":"1999","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023013110024458000_B36","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.1093\/nar\/gkl1155","article-title":"Structure-based prediction of C2H2 zinc-finger binding specificity: sensitivity to docking geometry","volume":"35","author":"Siggers","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013110024458000_B37","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1093\/bioinformatics\/16.1.16","article-title":"DNA binding sites: representation and discovery","volume":"16","author":"Stormo","year":"2000","journal-title":"Bioinformatics"},{"key":"2023013110024458000_B38","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1093\/protein\/8.4.319","article-title":"DNA recognition code of transcription factors","volume":"8","author":"Suzuki","year":"1995","journal-title":"Protein Eng."},{"key":"2023013110024458000_B39","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning Theory","author":"Vapnik","year":"1995"},{"key":"2023013110024458000_B40","doi-asserted-by":"crossref","first-page":"1304","DOI":"10.1126\/science.1058040","article-title":"The sequence of the human genome","volume":"291","author":"Venter","year":"2001","journal-title":"Science"},{"key":"2023013110024458000_B41","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1146\/annurev.biophys.29.1.183","article-title":"DNA recognition by Cys2His2 zinc finger proteins","volume":"29","author":"Wolfe","year":"2000","journal-title":"Annu. Rev. Biophys. Biomol. 
Struct."},{"key":"2023013110024458000_B42","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1016\/S0969-2126(01)00632-3","article-title":"Beyond the \u201crecognition code\u201d: structures of two Cys2His2 zinc finger\/TATA box complexes","volume":"9","author":"Wolfe","year":"2001","journal-title":"Structure"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/1\/22\/48982414\/bioinformatics_25_1_22.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/1\/22\/48982414\/bioinformatics_25_1_22.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T18:53:46Z","timestamp":1675191226000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/1\/22\/303146"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,11,13]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn580","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,1,1]]},"published":{"date-parts":[[2008,11,13]]}}}