{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T11:11:43Z","timestamp":1772190703095,"version":"3.50.1"},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T00:00:00Z","timestamp":1720396800000},"content-version":"vor","delay-in-days":46,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62162015"],"award-info":[{"award-number":["62162015"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61762026"],"award-info":[{"award-number":["61762026"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100012547","name":"Guangxi Natural Science Foundation","doi-asserted-by":"publisher","award":["2023GXNSFAA026054"],"award-info":[{"award-number":["2023GXNSFAA026054"]}],"id":[{"id":"10.13039\/100012547","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Innovation Project of GUET Graduate Education","award":["2024YCXS049"],"award-info":[{"award-number":["2024YCXS049"]}]},{"name":"Innovation Project of GUET Graduate Education","award":["2024YCXB12"],"award-info":[{"award-number":["2024YCXB12"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,5,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Mechanisms of protein-DNA interactions are involved in a wide range of biological activities and processes. 
Accurately identifying binding sites between proteins and DNA is crucial for analyzing genetic material, exploring protein functions, and designing novel drugs. In recent years, several computational methods have been proposed as alternatives to time-consuming and expensive traditional experiments. However, accurately predicting protein-DNA binding sites remains a challenge. Existing computational methods often rely on handcrafted features and a single-model architecture, leaving room for improvement. We propose a novel computational method, called EGPDI, based on multi-view graph embedding fusion. This approach integrates Equivariant Graph Neural Networks (EGNN) and Graph Convolutional Networks II (GCNII), independently configured to mine the global and local node embedding representations in depth. An advanced gated multi-head attention mechanism is subsequently employed to capture the attention weights of the dual embedding representations, thereby facilitating the integration of node features. In addition, extra node features from protein language models are introduced to provide more structural information. To our knowledge, this is the first time that multi-view graph embedding fusion has been applied to the task of protein\u2013DNA binding site prediction. The results of five-fold cross-validation and independent testing demonstrate that EGPDI outperforms state-of-the-art methods. 
Further comparative experiments and case studies also verify the superiority and generalization ability of EGPDI.<\/jats:p>","DOI":"10.1093\/bib\/bbae330","type":"journal-article","created":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T12:46:56Z","timestamp":1720442816000},"source":"Crossref","is-referenced-by-count":16,"title":["EGPDI: identifying protein\u2013DNA binding sites based on multi-view graph embedding fusion"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-3777-2641","authenticated-orcid":false,"given":"Mengxin","family":"Zheng","sequence":"first","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology , Guilin 541004 , China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4597-0268","authenticated-orcid":false,"given":"Guicong","family":"Sun","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology , Guilin 541004 , China"}]},{"given":"Xueping","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology , Guilin 541004 , China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0120-8092","authenticated-orcid":false,"given":"Yongxian","family":"Fan","sequence":"additional","affiliation":[{"name":"School of Computer Science and Information Security, Guilin University of Electronic Technology , Guilin 541004 , China"}]}],"member":"286","published-online":{"date-parts":[[2024,7,8]]},"reference":[{"key":"2024070812372455300_ref1","doi-asserted-by":"crossref","first-page":"1857","DOI":"10.1093\/bioinformatics\/btq295","article-title":"Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy 
function","volume":"26","author":"Zhao","year":"2010","journal-title":"Bioinformatics"},{"key":"2024070812372455300_ref2","doi-asserted-by":"crossref","first-page":"7364","DOI":"10.1093\/nar\/gkq617","article-title":"Genomic repertoires of DNA-binding transcription factors across the tree of life","volume":"38","author":"Charoensawan","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref3","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1038\/nrg2845","article-title":"Determining the specificity of protein\u2013DNA interactions","volume":"11","author":"Stormo","year":"2010","journal-title":"Nat Rev Genet"},{"key":"2024070812372455300_ref4","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1038\/s41564-022-01133-9","article-title":"Genome-wide protein\u2013DNA interaction site mapping in bacteria using a double-stranded DNA-specific cytosine deaminase","volume":"7","author":"Gallagher","year":"2022","journal-title":"Nat Microbiol"},{"key":"2024070812372455300_ref5","doi-asserted-by":"crossref","first-page":"1058","DOI":"10.1016\/j.febslet.2007.01.086","article-title":"Residue-level prediction of DNA-binding sites and its application on DNA-binding protein predictions","volume":"581","author":"Bhardwaj","year":"2007","journal-title":"FEBS Lett"},{"key":"2024070812372455300_ref6","doi-asserted-by":"crossref","first-page":"e1003341","DOI":"10.1371\/journal.pcbi.1003341","article-title":"Structure-based function prediction of uncharacterized protein using binding sites comparison","volume":"9","author":"Konc","year":"2013","journal-title":"PLoS Comput Biol"},{"key":"2024070812372455300_ref7","doi-asserted-by":"crossref","first-page":"5858","DOI":"10.1021\/jm100574m","article-title":"Understanding and predicting Druggability. 
A high-throughput method for detection of drug binding sites","volume":"53","author":"Schmidtke","year":"2010","journal-title":"J Med Chem"},{"key":"2024070812372455300_ref8","doi-asserted-by":"crossref","first-page":"3240","DOI":"10.1021\/acs.jcim.0c01494","article-title":"De novo molecule design through the molecular generative model conditioned by 3D information of protein binding sites","volume":"61","author":"Xu","year":"2021","journal-title":"J Chem Inf Model"},{"key":"2024070812372455300_ref9","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","article-title":"CATH \u2013 a hierarchic classification of protein domain structures","volume":"5","author":"Orengo","year":"1997","journal-title":"Structure"},{"key":"2024070812372455300_ref10","doi-asserted-by":"crossref","first-page":"2306","DOI":"10.1093\/nar\/26.10.2306","article-title":"Quantitative parameters for amino acid-base interaction: implications for prediction of protein-DNA binding sites","volume":"26","author":"Mandel-Gutfreund","year":"1998","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref11","doi-asserted-by":"crossref","first-page":"e2202799119","DOI":"10.1073\/pnas.2202799119","article-title":"Cryo-EM structure of DNA-bound Smc5\/6 reveals DNA clamping enabled by multi-subunit conformational changes","volume":"119","author":"Yu","year":"2022","journal-title":"Proc Natl Acad Sci"},{"key":"2024070812372455300_ref12","doi-asserted-by":"crossref","first-page":"994","DOI":"10.1109\/TCBB.2013.104","article-title":"Designing template-free predictor for targeting protein-ligand binding sites with classifier ensemble and spatial clustering","volume":"10","author":"Yu","year":"2013","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2024070812372455300_ref13","doi-asserted-by":"crossref","first-page":"i343","DOI":"10.1093\/bioinformatics\/btz324","article-title":"SCRIBER: accurate and partner type-specific prediction of protein-binding 
residues from proteins sequences","volume":"35","author":"Zhang","year":"2019","journal-title":"Bioinformatics"},{"key":"2024070812372455300_ref14","doi-asserted-by":"crossref","first-page":"1389","DOI":"10.1109\/TCBB.2016.2616469","article-title":"Predicting protein-DNA binding residues by Weightedly combining sequence-based features and boosting multiple SVMs","volume":"14","author":"Hu","year":"2017","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2024070812372455300_ref15","doi-asserted-by":"crossref","first-page":"bbaa397","DOI":"10.1093\/bib\/bbaa397","article-title":"NCBRPred: predicting nucleic acid binding residues in proteins based on multilabel learning","volume":"22","author":"Zhang","year":"2021","journal-title":"Brief Bioinform"},{"key":"2024070812372455300_ref16","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1109\/AIIIP61647.2023.00022","volume-title":"2023 2nd International Conference on Artificial Intelligence and Intelligent Information Processing (AIIIP)","author":"Zhang","year":"2023"},{"key":"2024070812372455300_ref17","doi-asserted-by":"crossref","first-page":"W438","DOI":"10.1093\/nar\/gky439","article-title":"COACH-D: improved protein\u2013ligand binding sites prediction with refined ligand-binding poses through molecular docking","volume":"46","author":"Wu","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref18","doi-asserted-by":"crossref","first-page":"2588","DOI":"10.1093\/bioinformatics\/btt447","article-title":"Protein\u2013ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment","volume":"29","author":"Yang","year":"2013","journal-title":"Bioinformatics"},{"key":"2024070812372455300_ref19","doi-asserted-by":"crossref","first-page":"W471","DOI":"10.1093\/nar\/gks372","article-title":"COFACTOR: an accurate comparative algorithm for structure-based protein function 
annotation","volume":"40","author":"Roy","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref20","doi-asserted-by":"crossref","first-page":"e51","DOI":"10.1093\/nar\/gkab044","article-title":"GraphBind: protein structural context embedded rules learned by hierarchical graph neural networks for recognizing nucleic-acid-binding residues","volume":"49","author":"Xia","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref21","doi-asserted-by":"crossref","first-page":"bbab564","DOI":"10.1093\/bib\/bbab564","article-title":"AlphaFold2-aware protein\u2013DNA binding site prediction using graph transformer","volume":"23","author":"Yuan","year":"2022","journal-title":"Brief Bioinform"},{"key":"2024070812372455300_ref22","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2024070812372455300_ref23","doi-asserted-by":"crossref","first-page":"bbad360","DOI":"10.1093\/bib\/bbad360","article-title":"Accurately identifying nucleic-acid-binding sites through geometric graph learning on language model predicted structures","volume":"24","author":"Song","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024070812372455300_ref24","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkae039","article-title":"EquiPNAS: improved protein\u2013nucleic acid binding site prediction using protein-language-model-informed equivariant deep graph neural networks","volume":"52","author":"Roche","year":"2024","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref25","doi-asserted-by":"crossref","first-page":"1885","DOI":"10.1002\/prot.24330","article-title":"DNABind: a hybrid algorithm for structure-based prediction of DNA-binding residues by combining machine learning- and template-based approaches: DNA-binding residue 
prediction","volume":"81","author":"Liu","year":"2013","journal-title":"Proteins"},{"key":"2024070812372455300_ref26","doi-asserted-by":"crossref","first-page":"930","DOI":"10.1093\/bioinformatics\/bty756","article-title":"Improving the prediction of protein\u2013nucleic acids binding residues via multiple sequence profiles and the consensus of complementary methods","volume":"35","author":"Su","year":"2019","journal-title":"Bioinformatics"},{"key":"2024070812372455300_ref27","doi-asserted-by":"crossref","first-page":"e1011428","DOI":"10.1371\/journal.pcbi.1011428","article-title":"Structure-based prediction of nucleic acid binding residues by merging deep learning- and template-based approaches","volume":"19","author":"Jiang","year":"2023","journal-title":"PLoS Comput Biol"},{"key":"2024070812372455300_ref28","doi-asserted-by":"crossref","first-page":"510","DOI":"10.1002\/prot.10221","article-title":"Data mining the protein data bank: residue interactions","volume":"49","author":"Oldfield","year":"2002","journal-title":"Proteins"},{"key":"2024070812372455300_ref29","doi-asserted-by":"crossref","first-page":"10086","DOI":"10.1093\/nar\/gku681","article-title":"Quantifying sequence and structural features of protein\u2013RNA interactions","volume":"42","author":"Li","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref30","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1093\/bioinformatics\/btac020","article-title":"ProteinBERT: a universal deep-learning model of protein sequence and function","volume":"38","author":"Brandes","year":"2022","journal-title":"Bioinformatics"},{"key":"2024070812372455300_ref31","doi-asserted-by":"crossref","first-page":"1617","DOI":"10.1038\/s41587-022-01432-w","article-title":"Single-sequence protein structure prediction using a language model and deep learning","volume":"40","author":"Chowdhury","year":"2022","journal-title":"Nat 
Biotechnol"},{"key":"2024070812372455300_ref32","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci"},{"key":"2024070812372455300_ref33","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Lin","year":"2023","journal-title":"Science"},{"key":"2024070812372455300_ref34","article-title":"E(n) equivariant graph neural networks","volume-title":"International conference on machine learning","author":"Satorras","year":"2022"},{"key":"2024070812372455300_ref35","article-title":"FABind: Fast and accurate protein-ligand binding","volume-title":"Advances in Neural Information Processing Systems","author":"Pei","year":"2024"},{"key":"2024070812372455300_ref36","article-title":"Representation learning on biomolecular structures using Equivariant graph attention","author":"Le"},{"key":"2024070812372455300_ref37","doi-asserted-by":"crossref","first-page":"3901","DOI":"10.18653\/v1\/D18-1424","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Zhao","year":"2018"},{"key":"2024070812372455300_ref38","article-title":"Attention is all you need","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani"},{"key":"2024070812372455300_ref39","doi-asserted-by":"crossref","first-page":"680","DOI":"10.1093\/bioinformatics\/btq003","article-title":"CD-HIT suite: a web server for clustering and comparing biological sequences","volume":"26","author":"Huang","year":"2010","journal-title":"Bioinformatics"},{"key":"2024070812372455300_ref40","volume-title":"Semi-supervised classification with graph convolutional 
networks","author":"Kipf","year":"2016"},{"key":"2024070812372455300_ref41","article-title":"Simple and deep graph convolutional networks","author":"Chen"},{"key":"2024070812372455300_ref42","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","article-title":"The graph neural network model","volume":"20","author":"Scarselli","year":"2009","journal-title":"IEEE Trans Neural Netw"},{"key":"2024070812372455300_ref43","doi-asserted-by":"crossref","first-page":"2222","DOI":"10.1109\/TNNLS.2016.2582924","article-title":"LSTM: a search space odyssey","volume":"28","author":"Greff","year":"2017","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"2024070812372455300_ref44","volume-title":"Bidirectional LSTM-CRF Models for Sequence Tagging","author":"Huang","year":"2015"},{"key":"2024070812372455300_ref45","doi-asserted-by":"crossref","first-page":"1248","DOI":"10.1038\/nature08473","article-title":"The role of DNA shape in protein\u2013DNA recognition","volume":"461","author":"Rohs","year":"2009","journal-title":"Nature"},{"key":"2024070812372455300_ref46","doi-asserted-by":"crossref","first-page":"11883","DOI":"10.1093\/nar\/gky1057","article-title":"Flexibility and structure of flanking DNA impact transcription factor affinity for its core motif","volume":"46","author":"Yella","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2024070812372455300_ref47","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1002\/prot.25061","article-title":"Statistical analysis of structural determinants for protein\u2013DNA-binding specificity","volume":"84","author":"Corona","year":"2016","journal-title":"Proteins"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/4\/bbae330\/58469013\/bbae330.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/4\/bbae330\/58469013\/bbae330.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T12:47:32Z","timestamp":1720442852000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae330\/7709094"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,23]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,5,23]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae330","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,7]]},"published":{"date-parts":[[2024,5,23]]},"article-number":"bbae330"}}