{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,14]],"date-time":"2026-01-14T15:43:23Z","timestamp":1768405403992,"version":"3.49.0"},"reference-count":46,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T00:00:00Z","timestamp":1684195200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32271294"],"award-info":[{"award-number":["32271294"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["31971180"],"award-info":[{"award-number":["31971180"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,7,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Protein\u2013deoxyribonucleic acid (DNA) interactions are important in a variety of biological processes. Accurately predicting protein-DNA binding affinity has been one of the most attractive and challenging issues in computational biology. However, the existing approaches still have much room for improvement. In this work, we propose an ensemble model for Protein-DNA Binding Affinity prediction (emPDBA), which combines six base models with one meta-model. The complexes are classified into four types based on the DNA structure (double-stranded or other forms) and the percentage of interface residues. For each type, emPDBA is trained with the sequence-based, structure-based and energy features from binding partners and complex structures. 
Through feature selection by the sequential forward selection method, it is found that there do exist considerable differences in the key factors contributing to intermolecular binding affinity. The complex classification is beneficial for the important feature extraction for binding affinity prediction. The performance comparison of our method with other peer ones on the independent testing dataset shows that emPDBA outperforms the state-of-the-art methods with the Pearson correlation coefficient of 0.53 and the mean absolute error of 1.11\u00a0kcal\/mol. The comprehensive results demonstrate that our method has a good performance for protein-DNA binding affinity prediction.<\/jats:p>\n               <jats:p>Availability and implementation: The source code is available at https:\/\/github.com\/ChunhuaLiLab\/emPDBA\/.<\/jats:p>","DOI":"10.1093\/bib\/bbad192","type":"journal-article","created":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T02:54:27Z","timestamp":1684292067000},"source":"Crossref","is-referenced-by-count":10,"title":["emPDBA: protein-DNA binding affinity prediction by combining features from binding partners and interface learned with ensemble regression model"],"prefix":"10.1093","volume":"24","author":[{"given":"Shuang","family":"Yang","sequence":"first","affiliation":[{"name":"Faculty of Environmental and Life Sciences, Beijing University of Technology , Beijing 100124 , China"}]},{"given":"Weikang","family":"Gong","sequence":"additional","affiliation":[{"name":"Faculty of Environmental and Life Sciences, Beijing University of Technology , Beijing 100124 , China"}]},{"given":"Tong","family":"Zhou","sequence":"additional","affiliation":[{"name":"Faculty of Environmental and Life Sciences, Beijing University of Technology , Beijing 100124 , China"}]},{"given":"Xiaohan","family":"Sun","sequence":"additional","affiliation":[{"name":"Faculty of Environmental and Life Sciences, Beijing University of Technology , Beijing 100124 , 
China"}]},{"given":"Lei","family":"Chen","sequence":"additional","affiliation":[{"name":"Faculty of Environmental and Life Sciences, Beijing University of Technology , Beijing 100124 , China"}]},{"given":"Wenxue","family":"Zhou","sequence":"additional","affiliation":[{"name":"Faculty of Environmental and Life Sciences, Beijing University of Technology , Beijing 100124 , China"}]},{"given":"Chunhua","family":"Li","sequence":"additional","affiliation":[{"name":"Faculty of Environmental and Life Sciences, Beijing University of Technology , Beijing 100124 , China"}]}],"member":"286","published-online":{"date-parts":[[2023,5,16]]},"reference":[{"issue":"1","key":"2023072020055262800_ref1","doi-asserted-by":"crossref","first-page":"reviews001.1","DOI":"10.1186\/gb-2000-1-1-reviews001","article-title":"An overview of the structures of protein-DNA complexes","volume":"1","author":"Luscombe","year":"2000","journal-title":"Genome Biol"},{"issue":"4","key":"2023072020055262800_ref2","doi-asserted-by":"crossref","first-page":"1349","DOI":"10.1534\/genetics.115.178384","article-title":"A biophysical approach to predicting protein-DNA binding energetics","volume":"200","author":"Locke","year":"2015","journal-title":"Genetics"},{"issue":"8","key":"2023072020055262800_ref3","doi-asserted-by":"crossref","first-page":"1849","DOI":"10.1038\/nprot.2007.249","article-title":"Electrophoretic mobility shift assay (EMSA) for detecting protein-nucleic acid interactions","volume":"2","author":"Hellman","year":"2007","journal-title":"Nat Protoc"},{"key":"2023072020055262800_ref4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-1-60327-015-1_1","article-title":"Filter-binding assays","volume":"543","author":"Stockley","year":"2009","journal-title":"Methods Mol Biol"},{"key":"2023072020055262800_ref5","first-page":"65","article-title":"Fluorescence spectroscopy","volume":"40","author":"Royer","year":"1995","journal-title":"Methods Mol 
Biol"},{"issue":"1","key":"2023072020055262800_ref6","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1038\/nprot.2006.28","article-title":"Isothermal titration calorimetry to determine association constants for high-affinity ligands","volume":"47","author":"Velazquez-Campoy","year":"2006","journal-title":"Nat Protoc"},{"key":"2023072020055262800_ref7","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1007\/978-1-61779-974-7_24","article-title":"Measuring antibody-antigen binding kinetics using surface plasmon resonance","volume":"907","author":"Hearty","year":"2012","journal-title":"Methods Mol Biol"},{"issue":"8","key":"2023072020055262800_ref8","doi-asserted-by":"crossref","first-page":"1420","DOI":"10.1063\/1.1740409","article-title":"High-temperature equation of state by a perturbation method I nonpolar gases","volume":"22","author":"Zwanzig","year":"1954","journal-title":"J Chem Phys"},{"issue":"11","key":"2023072020055262800_ref9","doi-asserted-by":"crossref","first-page":"6720","DOI":"10.1063\/1.451846","article-title":"Free energy of hydrophobic hydration: a molecular dynamics study of noble gases in water","volume":"85","author":"Straatsma","year":"1986","journal-title":"J Chem Phys"},{"issue":"12","key":"2023072020055262800_ref10","doi-asserted-by":"crossref","first-page":"889","DOI":"10.1021\/ar000033j","article-title":"Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models","volume":"33","author":"Kollman","year":"2000","journal-title":"Acc Chem Res"},{"issue":"8","key":"2023072020055262800_ref11","doi-asserted-by":"crossref","first-page":"1656","DOI":"10.1021\/ci8001167","article-title":"MedusaScore: an accurate force field-based scoring function for virtual drug screening","volume":"48","author":"Yin","year":"2008","journal-title":"J Chem Inf 
Model"},{"issue":"7","key":"2023072020055262800_ref12","doi-asserted-by":"crossref","first-page":"2325","DOI":"10.1021\/jm049314d","article-title":"A knowledge-based energy function for protein-ligand, protein-protein, and protein-DNA complexes","volume":"48","author":"Zhang","year":"2005","journal-title":"J Med Chem"},{"issue":"11","key":"2023072020055262800_ref13","doi-asserted-by":"crossref","first-page":"2714","DOI":"10.1110\/ps.0217002","article-title":"Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction","volume":"11","author":"Zhou","year":"2002","journal-title":"Protein Sci"},{"issue":"10","key":"2023072020055262800_ref14","doi-asserted-by":"crossref","first-page":"1990","DOI":"10.1021\/ci800125k","article-title":"Information theory-based scoring function for the structure-based prediction of protein-ligand binding affinity","volume":"48","author":"Kulharia","year":"2008","journal-title":"J Chem Inf Model"},{"issue":"12","key":"2023072020055262800_ref15","doi-asserted-by":"crossref","first-page":"1628","DOI":"10.1261\/rna.071779.119","article-title":"A structure-based model for the prediction of protein-RNA binding affinity","volume":"25","author":"Nithin","year":"2019","journal-title":"RNA"},{"issue":"1","key":"2023072020055262800_ref16","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1002\/prot.24946","article-title":"High-resolution crystal structures leverage protein binding affinity predictions","volume":"84","author":"Marillet","year":"2016","journal-title":"Proteins"},{"key":"2023072020055262800_ref17","doi-asserted-by":"crossref","first-page":"e07454","DOI":"10.7554\/eLife.07454","article-title":"Contacts-based prediction of binding affinity in protein-protein 
complexes","volume":"4","author":"Vangone","year":"2015","journal-title":"Elife"},{"key":"2023072020055262800_ref18","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1007\/978-1-4939-9752-7_16","article-title":"Machine learning to predict binding affinity","volume":"2053","author":"Bitencourt-Ferreira","year":"2019","journal-title":"Methods Mol Biol"},{"issue":"23","key":"2023072020055262800_ref19","doi-asserted-by":"crossref","first-page":"2459","DOI":"10.2174\/0929867324666170623092503","article-title":"Supervised machine learning methods applied to predict ligand- binding affinity","volume":"24","author":"Heck","year":"2017","journal-title":"Curr Med Chem"},{"issue":"6","key":"2023072020055262800_ref20","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1002\/wcms.1225","article-title":"Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening","volume":"5","author":"Ain","year":"2015","journal-title":"Wiley Interdiscip Rev Comput Mol Sci"},{"issue":"12","key":"2023072020055262800_ref21","doi-asserted-by":"crossref","first-page":"4111","DOI":"10.1021\/jm048957q","article-title":"The PDBbind database: methodologies and updates","volume":"48","author":"Wang","year":"2005","journal-title":"J Med Chem"},{"issue":"15","key":"2023072020055262800_ref22","doi-asserted-by":"crossref","first-page":"1857","DOI":"10.1093\/bioinformatics\/btq295","article-title":"Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function","volume":"26","author":"Zhao","year":"2010","journal-title":"Bioinformatics"},{"issue":"1","key":"2023072020055262800_ref23","doi-asserted-by":"crossref","first-page":"1278","DOI":"10.1038\/s41598-020-57778-1","article-title":"PreDBA: a heterogeneous ensemble approach for predicting protein-DNA binding affinity","volume":"10","author":"Yang","year":"2020","journal-title":"Sci 
Rep"},{"key":"2023072020055262800_ref24","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1186\/1471-2105-11-262","article-title":"The protein-DNA Interface database","volume":"11","author":"Norambuena","year":"2010","journal-title":"BMC Bioinform"},{"issue":"4","key":"2023072020055262800_ref25","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1007\/s00214-017-2083-1","article-title":"Feature functional theory-binding predictor (FFT-BP) for the blind prediction of binding free energies","volume":"136","author":"Wang","year":"2017","journal-title":"Theor Chem Accounts"},{"issue":"D1","key":"2023072020055262800_ref26","doi-asserted-by":"crossref","first-page":"D1528","DOI":"10.1093\/nar\/gkab848","article-title":"ProNAB: database for binding affinities of protein-nucleic acid complexes and their mutants","volume":"50","author":"Harini","year":"2022","journal-title":"Nucleic Acids Res"},{"issue":"13","key":"2023072020055262800_ref27","doi-asserted-by":"crossref","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","article-title":"Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences","volume":"22","author":"Li","year":"2006","journal-title":"Bioinformatics"},{"issue":"7","key":"2023072020055262800_ref28","doi-asserted-by":"crossref","first-page":"4046","DOI":"10.1073\/pnas.78.7.4046","article-title":"On the attribution and additivity of binding energies","volume":"78","author":"Jencks","year":"1981","journal-title":"Proc Natl Acad Sci USA"},{"key":"2023072020055262800_ref29","doi-asserted-by":"crossref","first-page":"510","DOI":"10.1002\/pro.2230","article-title":"Protein-protein interactions: general trends in the relationship between binding affinity and interfacial buried surface area","volume":"22","author":"Chen","year":"2013","journal-title":"Protein 
Sci"},{"issue":"24","key":"2023072020055262800_ref30","doi-asserted-by":"crossref","first-page":"3583","DOI":"10.1093\/bioinformatics\/btu580","article-title":"Protein-protein binding affinity prediction from amino acid sequence","volume":"30","author":"Yugandhar","year":"2014","journal-title":"Bioinformatics"},{"issue":"7","key":"2023072020055262800_ref31","doi-asserted-by":"crossref","first-page":"937","DOI":"10.1093\/bioinformatics\/btaa747","article-title":"aPRBind: protein-RNA interface prediction by combining sequence and I-TASSER model-based structural features learned with convolutional neural networks","volume":"37","author":"Liu","year":"2021","journal-title":"Bioinformatics"},{"issue":"12","key":"2023072020055262800_ref32","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2023072020055262800_ref33","volume-title":"NACCESS, Computer Program","author":"Hubbard","year":"1993"},{"issue":"1","key":"2023072020055262800_ref34","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/0263-7855(96)00018-5","article-title":"VMD: visual molecular dynamics","volume":"14","author":"Humphrey","year":"1996","journal-title":"J Mol Graph"},{"issue":"1","key":"2023072020055262800_ref35","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/S0022-2836(03)00670-3","article-title":"Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations","volume":"331","author":"Gray","year":"2003","journal-title":"J Mol Biol"},{"issue":"19","key":"2023072020055262800_ref36","doi-asserted-by":"crossref","first-page":"10383","DOI":"10.1073\/pnas.97.19.10383","article-title":"Native protein sequences are close to optimal for their 
structures","volume":"97","author":"Kuhlman","year":"2000","journal-title":"Proc Natl Acad Sci USA"},{"issue":"Web Server issue","key":"2023072020055262800_ref37","doi-asserted-by":"crossref","first-page":"W233","DOI":"10.1093\/nar\/gkn216","article-title":"The RosettaDock server for local protein-protein docking","volume":"36","author":"Lyskov","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023072020055262800_ref38","doi-asserted-by":"crossref","first-page":"1902","DOI":"10.1063\/1.472061","article-title":"Simulation of activation free energies in molecular systems","volume":"105","author":"Neria","year":"1996","journal-title":"J Chem Phys"},{"key":"2023072020055262800_ref39","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/S0004-3702(97)00043-X","article-title":"Wrappers for feature subset selection","volume":"97","author":"Kohavi","year":"1997","journal-title":"Artif Intell"},{"key":"2023072020055262800_ref40","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","article-title":"Stochastic gradient boosting","volume":"38","author":"Friedman","year":"2002","journal-title":"Comput Stat Data Anal"},{"key":"2023072020055262800_ref41","volume-title":"Expert Systems in the Micro-electronic Age","author":"Quinlan","year":"1979"},{"key":"2023072020055262800_ref42","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach Learn"},{"key":"2023072020055262800_ref43","first-page":"148","volume-title":"Proceedings of the 13th Conference on Machine Learning","author":"Freund","year":"1996"},{"key":"2023072020055262800_ref44","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach 
Learn"},{"key":"2023072020055262800_ref45","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1145\/2939672.2939785","volume-title":"Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining","author":"Chen","year":"2016"},{"issue":"4","key":"2023072020055262800_ref46","doi-asserted-by":"crossref","first-page":"e2692","DOI":"10.1002\/jmr.2692","article-title":"Dissecting and analyzing key residues in protein-DNA complexes","volume":"31","author":"Kulandaisamy","year":"2018","journal-title":"J Mol Recognit"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/4\/bbad192\/50916891\/bbad192.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/4\/bbad192\/50916891\/bbad192.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,20]],"date-time":"2023-07-20T20:07:45Z","timestamp":1689883665000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbad192\/7165253"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,16]]},"references-count":46,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,7,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbad192","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,7]]},"published":{"date-parts":[[2023,5,16]]},"article-number":"bbad192"}}