{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T12:18:50Z","timestamp":1767961130440,"version":"3.49.0"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Thousands of proteins are known to bind to DNA; for most of them the mechanism of action and the residues that bind to DNA, i.e. the binding sites, are yet unknown. Experimental identification of binding sites requires expensive and laborious methods such as mutagenesis and binding assays. Hence, such studies are not applicable on a large scale. If the 3D structure of a protein is known, it is often possible to predict DNA-binding sites in silico. However, for most proteins, such knowledge is not available.<\/jats:p>\n               <jats:p>Results: It has been shown that DNA-binding residues have distinct biophysical characteristics. Here we demonstrate that these characteristics are so distinct that they enable accurate prediction of the residues that bind DNA directly from amino acid sequence, without requiring any additional experimental or structural information. In a cross-validation based on the largest non-redundant dataset of high-resolution protein\u2013DNA complexes available today, we found that 89% of our predictions are confirmed by experimental data. 
Thus, it is now possible to identify DNA-binding sites on a proteomic scale even in the absence of any experimental data or 3D-structural information.<\/jats:p>\n               <jats:p>Availability: http:\/\/cubic.bioc.columbia.edu\/services\/disis<\/jats:p>\n               <jats:p>Contact: yo135@columbia.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm174","type":"journal-article","created":{"date-parts":[[2007,7,23]],"date-time":"2007-07-23T16:13:46Z","timestamp":1185207226000},"page":"i347-i353","source":"Crossref","is-referenced-by-count":145,"title":["Prediction of DNA-binding residues from sequence"],"prefix":"10.1093","volume":"23","author":[{"given":"Yanay","family":"Ofran","sequence":"first","affiliation":[{"name":"1 Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, 2Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130\u2009St Nicholas Ave. Rm. 802, 3D.E.Shaw Research, 120 West Forty Fifth Street (current affiliation) and 4NorthEast Structural Genomics Consortium (NESG), Columbia University, 1130\u2009St Nicholas Ave. Rm. 802, New York, NY 10032, USA"},{"name":"1 Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, 2Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130\u2009St Nicholas Ave. Rm. 802, 3D.E.Shaw Research, 120 West Forty Fifth Street (current affiliation) and 4NorthEast Structural Genomics Consortium (NESG), Columbia University, 1130\u2009St Nicholas Ave. Rm. 802, New York, NY 10032, USA"}]},{"given":"Venkatesh","family":"Mysore","sequence":"additional","affiliation":[{"name":"1 Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, 2Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130\u2009St Nicholas Ave. Rm. 
802, 3D.E.Shaw Research, 120 West Forty Fifth Street (current affiliation) and 4NorthEast Structural Genomics Consortium (NESG), Columbia University, 1130\u2009St Nicholas Ave. Rm. 802, New York, NY 10032, USA"},{"name":"1 Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, 2Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130\u2009St Nicholas Ave. Rm. 802, 3D.E.Shaw Research, 120 West Forty Fifth Street (current affiliation) and 4NorthEast Structural Genomics Consortium (NESG), Columbia University, 1130\u2009St Nicholas Ave. Rm. 802, New York, NY 10032, USA"}]},{"given":"Burkhard","family":"Rost","sequence":"additional","affiliation":[{"name":"1 Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, 2Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130\u2009St Nicholas Ave. Rm. 802, 3D.E.Shaw Research, 120 West Forty Fifth Street (current affiliation) and 4NorthEast Structural Genomics Consortium (NESG), Columbia University, 1130\u2009St Nicholas Ave. Rm. 802, New York, NY 10032, USA"},{"name":"1 Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, 2Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130\u2009St Nicholas Ave. Rm. 802, 3D.E.Shaw Research, 120 West Forty Fifth Street (current affiliation) and 4NorthEast Structural Genomics Consortium (NESG), Columbia University, 1130\u2009St Nicholas Ave. Rm. 802, New York, NY 10032, USA"},{"name":"1 Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, 2Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130\u2009St Nicholas Ave. Rm. 802, 3D.E.Shaw Research, 120 West Forty Fifth Street (current affiliation) and 4NorthEast Structural Genomics Consortium (NESG), Columbia University, 1130\u2009St Nicholas Ave. Rm. 
802, New York, NY 10032, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,7,1]]},"reference":[{"key":"2023062708450755100_B1","doi-asserted-by":"crossref","first-page":"477","DOI":"10.1093\/bioinformatics\/btg432","article-title":"Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information","volume":"20","author":"Ahmad","year":"2004","journal-title":"Bioinformatics"},{"key":"2023062708450755100_B2","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/j.jmb.2004.05.058","article-title":"Moment-based prediction of DNA-binding proteins","volume":"341","author":"Ahmad","year":"2004","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B3","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B4","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B5","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1016\/S0076-6879(06)10013-0","article-title":"Analysis of sequence specificities of DNA-binding proteins with protein binding microarrays","volume":"410","author":"Bulyk","year":"2006","journal-title":"Methods Enzymol"},{"key":"2023062708450755100_B6","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1002\/prot.20741","article-title":"Exploiting sequence and structure homologs to identify protein-protein binding sites","volume":"62","author":"Chung","year":"2006","journal-title":"Proteins"},{"key":"2023062708450755100_B7","doi-asserted-by":"crossref","first-page":"1356","DOI":"10.1046\/j.1432-1033.2002.02767.x","article-title":"Prediction of 
protein\u2013protein interaction sites in heterocomplexes with neural networks","volume":"269","author":"Fariselli","year":"2002","journal-title":"Eur. J. Biochem"},{"key":"2023062708450755100_B8","doi-asserted-by":"crossref","first-page":"843","DOI":"10.1016\/j.jmb.2003.10.069","article-title":"Identification of protein-protein interaction sites from docking energy landscapes","volume":"335","author":"Fernandez-Recio","year":"2004","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B9","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1093\/bioinformatics\/15.9.759","article-title":"Finding families for genomic ORFans","volume":"15","author":"Fischer","year":"1999","journal-title":"Bioinformatics"},{"key":"2023062708450755100_B10","article-title":"Making large-scale SVM learning practical","volume-title":"Advances in Kernel Methods - Support Vector Learning","author":"Joachims","year":"1999"},{"key":"2023062708450755100_B11","doi-asserted-by":"crossref","first-page":"7189","DOI":"10.1093\/nar\/gkg922","article-title":"Using electrostatic potentials to predict DNA-binding sites on DNA-binding proteins","volume":"31","author":"Jones","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B12","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1006\/jmbi.1997.1234","article-title":"Analysis of protein-protein interaction sites using surface patches","volume":"272","author":"Jones","year":"1997","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B13","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1006\/jmbi.1997.1233","article-title":"Prediction of protein-protein interaction sites using patch analysis","volume":"272","author":"Jones","year":"1997","journal-title":"J. Mol. 
Biol"},{"key":"2023062708450755100_B14","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.cbpa.2003.11.001","article-title":"Searching for functional sites in protein structures","volume":"8","author":"Jones","year":"2004","journal-title":"Curr. Opin. Chem. Biol"},{"key":"2023062708450755100_B15","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1002\/jcc.10361","article-title":"Pattern recognition strategies for molecular surfaces: III. Binding site prediction with a neural network","volume":"25","author":"Keil","year":"2004","journal-title":"J. Comput. Chem"},{"key":"2023062708450755100_B16","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1093\/protein\/gzh020","article-title":"Prediction of protein-protein interaction sites using support vector machines","volume":"17","author":"Koike","year":"2004","journal-title":"Protein Eng. Des. Sel"},{"key":"2023062708450755100_B17","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1002\/prot.20977","article-title":"Using evolutionary and structural information to predict DNA-binding sites on DNA-binding proteins","volume":"64","author":"Kuznetsov","year":"2006","journal-title":"Proteins"},{"key":"2023062708450755100_B18","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1002\/prot.20607","article-title":"Protein-nucleic acid recognition: statistical analysis of atomic interactions and influence of DNA structure","volume":"61","author":"Lejeune","year":"2005","journal-title":"Proteins"},{"key":"2023062708450755100_B19","doi-asserted-by":"crossref","first-page":"1970","DOI":"10.1110\/ps.10101","article-title":"Comparing function and structure between entire proteomes","volume":"10","author":"Liu","year":"2001","journal-title":"Protein Sci"},{"key":"2023062708450755100_B20","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1093\/bioinformatics\/18.7.922","article-title":"Target space for structural genomics 
revisited","volume":"18","author":"Liu","year":"2002","journal-title":"Bioinformatics"},{"key":"2023062708450755100_B21","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1002\/prot.20012","article-title":"Automatic target selection for structural genomics on eukaryotes","volume":"56","author":"Liu","year":"2004","journal-title":"Proteins: Structure, Function, and Bioinformatics"},{"key":"2023062708450755100_B22","doi-asserted-by":"crossref","first-page":"2177","DOI":"10.1006\/jmbi.1998.2439","article-title":"The atomic structure of protein-protein recognition sites","volume":"285","author":"Lo Conte","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B23","doi-asserted-by":"crossref","first-page":"2306","DOI":"10.1093\/nar\/26.10.2306","article-title":"Quantitative parameters for amino acid-base interaction: implications for prediction of protein-DNA binding sites","volume":"26","author":"Mandel-Gutfreund","year":"1998","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B24","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1006\/jmbi.1995.0559","article-title":"Comprehensive analysis of hydrogen bonds in regulatory protein DNA-complexes: in search of common principles","volume":"253","author":"Mandel-Gutfreund","year":"1995","journal-title":"J. Mol. 
Biol"},{"key":"2023062708450755100_B25","doi-asserted-by":"crossref","first-page":"3789","DOI":"10.1093\/nar\/gkg620","article-title":"UniqueProt: creating representative protein sequence sets","volume":"31","author":"Mika","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B26","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1021\/bi982362d","article-title":"Structural features of protein-nucleic acid recognition sites","volume":"38","author":"Nadassy","year":"1999","journal-title":"Biochemistry"},{"key":"2023062708450755100_B27","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/j.jmb.2004.02.040","article-title":"ProMate: a structure based prediction program to identify the location of protein-protein binding sites","volume":"338","author":"Neuvirth","year":"2004","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B28","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1016\/S0022-2836(02)01223-8","article-title":"Analysing six types of protein-protein interfaces","volume":"325","author":"Ofran","year":"2003","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B29","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/S0014-5793(03)00456-3","article-title":"Predicted protein-protein interaction sites from local sequence information","volume":"544","author":"Ofran","year":"2003","journal-title":"FEBS Lett"},{"issue":"2","key":"2023062708450755100_B30","doi-asserted-by":"crossref","first-page":"e13","DOI":"10.1093\/bioinformatics\/btl303","article-title":"ISIS: Interaction Sites Identified from Sequence","volume":"23","author":"Ofran","year":"2006","journal-title":"Bioinformatics"},{"key":"2023062708450755100_B31","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1146\/annurev.bi.53.070184.001453","article-title":"Protein-DNA recognition","volume":"53","author":"Pabo","year":"1984","journal-title":"Annu. Rev. 
Biochem"},{"key":"2023062708450755100_B32","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1002\/prot.10029","article-title":"Alignments grow, secondary structure prediction improves","volume":"46","author":"Przybylski","year":"2002","journal-title":"Proteins"},{"key":"2023062708450755100_B33","doi-asserted-by":"crossref","first-page":"2496","DOI":"10.1093\/bioinformatics\/bti340","article-title":"An evolution based classifier for prediction of protein interfaces without using protein structures","volume":"21","author":"Res","year":"2005","journal-title":"Bioinformatics"},{"key":"2023062708450755100_B34","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1016\/0301-4622(74)80050-5","article-title":"Diffusion controlled reaction rates in spheroidal geometry. Application to repressor\u2013operator association and membrane bound enzymes","volume":"2","author":"Richter","year":"1974","journal-title":"Biophys. Chem"},{"key":"2023062708450755100_B35","doi-asserted-by":"crossref","first-page":"E42","DOI":"10.1371\/journal.pbio.0020042","article-title":"Identifying protein function\u2013a call for community action","volume":"2","author":"Roberts","year":"2004","journal-title":"PLoS Biol"},{"key":"2023062708450755100_B36","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1016\/S0076-6879(96)66033-9","article-title":"PHD: predicting one-dimensional protein structure by profile based neural networks","volume":"266","author":"Rost","year":"1996","journal-title":"Method. 
Enzymol"},{"key":"2023062708450755100_B37","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1093\/protein\/12.2.85","article-title":"Twilight zone of protein sequence alignments","volume":"12","author":"Rost","year":"1999","journal-title":"Protein Eng"},{"key":"2023062708450755100_B38","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1006\/jmbi.1993.1413","article-title":"Prediction of protein secondary structure at better than 70% accuracy","volume":"232","author":"Rost","year":"1993","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B39","doi-asserted-by":"crossref","first-page":"W321","DOI":"10.1093\/nar\/gkh377","article-title":"The PredictProtein server","volume":"32","author":"Rost","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B40","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1002\/prot.340090107","article-title":"Database of homology-derived protein structures and the structural meaning of sequence alignment","volume":"9","author":"Sander","year":"1991","journal-title":"Proteins"},{"key":"2023062708450755100_B41","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1146\/annurev.biophys.34.040204.144537","article-title":"Protein-DNA recognition patterns and predictions","volume":"34","author":"Sarai","year":"2005","journal-title":"Annu. Rev. Biophys. Biomol. 
Struct"},{"key":"2023062708450755100_B42","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1093\/nar\/24.1.201","article-title":"The HSSP database of protein structure-sequence alignments","volume":"24","author":"Schneider","year":"1996","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B43","doi-asserted-by":"crossref","first-page":"4732","DOI":"10.1093\/nar\/gkh803","article-title":"Identifying DNA-binding proteins using structural motifs and the electrostatic potential","volume":"32","author":"Shanahan","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B44","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1016\/S0959-440X(00)00065-8","article-title":"Electrostatic aspects of protein-protein interactions","volume":"10","author":"Sheinerman","year":"2000","journal-title":"Curr. Opin. Struct Biol"},{"issue":"4","key":"2023062708450755100_B45","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.1093\/nar\/gkl1155","article-title":"Structure-based prediction of C2H2 zinc-finger binding specificity: sensitivity to docking geometry","volume":"35","author":"Siggers","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023062708450755100_B46","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1016\/S0022-2836(03)00031-7","article-title":"Annotating nucleic acid-binding function based on protein structure","volume":"326","author":"Stawiski","year":"2003","journal-title":"J. Mol. Biol"},{"key":"2023062708450755100_B47","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1016\/j.jmb.2006.02.053","article-title":"Efficient prediction of nucleic acid binding function from low-resolution protein structures","volume":"358","author":"Szilagyi","year":"2006","journal-title":"J. Mol. 
Biol"},{"key":"2023062708450755100_B48","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1002\/prot.20111","article-title":"Structure-based prediction of DNA-binding sites on proteins using the empirical preference of electrostatic potential and the shape of molecular surfaces","volume":"55","author":"Tsuchiya","year":"2004","journal-title":"Proteins"},{"key":"2023062708450755100_B49","doi-asserted-by":"crossref","first-page":"1721","DOI":"10.1093\/bioinformatics\/bti232","article-title":"PreDs: a server for predicting dsDNA-binding site on protein molecular surfaces","volume":"21","author":"Tsuchiya","year":"2005","journal-title":"Bioinformatics"},{"key":"2023062708450755100_B50","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The nature of statistical learning theory","author":"Vapnik","year":"1995"},{"key":"2023062708450755100_B51","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1016\/S0021-9258(19)84994-3","article-title":"Facilitated target location in biological systems","volume":"264","author":"von Hippel","year":"1989","journal-title":"J. Biol. 
Chem"},{"issue":"2","key":"2023062708450755100_B52","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1016\/j.febslet.2005.11.081","article-title":"Predicting protein interaction sites from residue spatial sequence profile and evolution rate","volume":"580","author":"Wang","year":"2005","journal-title":"FEBS Lett"},{"key":"2023062708450755100_B53","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1186\/1471-2105-7-262","article-title":"Predicting DNA-binding sites of proteins from amino acid sequence","volume":"7","author":"Yan","year":"2006","journal-title":"BMC Bioinformat"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/13\/i347\/50714894\/bioinformatics_23_13_i347.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/13\/i347\/50714894\/bioinformatics_23_13_i347.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T08:48:26Z","timestamp":1687855706000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/13\/i347\/227121"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,7,1]]},"references-count":53,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2007,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm174","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,7]]},"published":{"date-parts":[[2007,7,1]]}}}