{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T21:13:11Z","timestamp":1777669991120,"version":"3.51.4"},"reference-count":66,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2022,1,17]],"date-time":"2022-01-17T00:00:00Z","timestamp":1642377600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Guangzhou S&T Research Plan","award":["202007030010"],"award-info":[{"award-number":["202007030010"]}]},{"name":"Guangzhou S&T Research Plan","award":["2016ZT06D211"],"award-info":[{"award-number":["2016ZT06D211"]}]},{"name":"Guangdong Key Field R&D Plan","award":["2018B010109006"],"award-info":[{"award-number":["2018B010109006"]}]},{"name":"Guangdong Key Field R&D Plan","award":["2019B020228001"],"award-info":[{"award-number":["2019B020228001"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62041209"],"award-info":[{"award-number":["62041209"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772566"],"award-info":[{"award-number":["61772566"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["2020YFB0204803"],"award-info":[{"award-number":["2020YFB0204803"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Protein\u2013DNA interactions play crucial roles in the biological systems, and identifying protein\u2013DNA binding sites is the first step for mechanistic understanding of various biological activities (such as transcription and repair) and designing novel drugs. How to accurately identify DNA-binding residues from only protein sequence remains a challenging task. Currently, most existing sequence-based methods only consider contextual features of the sequential neighbors, which are limited to capture spatial information. Based on the recent breakthrough in protein structure prediction by AlphaFold2, we propose an accurate predictor, GraphSite, for identifying DNA-binding residues based on the structural models predicted by AlphaFold2. Here, we convert the binding site prediction problem into a graph node classification task and employ a transformer-based variant model to take the protein structural information into account. By leveraging predicted protein structures and graph transformer, GraphSite substantially improves over the latest sequence-based and structure-based methods. The algorithm is further confirmed on the independent test set of 181 proteins, where GraphSite surpasses the state-of-the-art structure-based method by 16.4% in area under the precision-recall curve and 11.2% in Matthews correlation coefficient, respectively. We provide the datasets, the predicted structures and the source codes along with the pre-trained models of GraphSite at https:\/\/github.com\/biomed-AI\/GraphSite. The GraphSite web server is freely available at https:\/\/biomed.nscc-gz.cn\/apps\/GraphSite.<\/jats:p>","DOI":"10.1093\/bib\/bbab564","type":"journal-article","created":{"date-parts":[[2021,12,10]],"date-time":"2021-12-10T15:13:28Z","timestamp":1639149208000},"source":"Crossref","is-referenced-by-count":120,"title":["AlphaFold2-aware protein\u2013DNA binding site prediction using graph transformer"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6098-9103","authenticated-orcid":false,"given":"Qianmu","family":"Yuan","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sheng","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6840-8198","authenticated-orcid":false,"given":"Jiahua","family":"Rao","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9747-4285","authenticated-orcid":false,"given":"Shuangjia","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huiying","family":"Zhao","sequence":"additional","affiliation":[{"name":"Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuedong","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China"},{"name":"Key Laboratory of Machine Intelligence and Advanced Computing of MOE, Sun Yat-sen University, Guangzhou 510000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2022,1,17]]},"reference":[{"key":"2022031506260450800_ref1","doi-asserted-by":"crossref","first-page":"1857","DOI":"10.1093\/bioinformatics\/btq295","article-title":"Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function","volume":"26","author":"Zhao","year":"2010","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref2","doi-asserted-by":"crossref","first-page":"7364","DOI":"10.1093\/nar\/gkq617","article-title":"Genomic repertoires of DNA-binding transcription factors across the tree of life","volume":"38","author":"Charoensawan","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref3","doi-asserted-by":"crossref","first-page":"3575","DOI":"10.1093\/bioinformatics\/btx480","article-title":"Sequence2vec: a novel embedding approach for modeling transcription factor binding affinity landscape","volume":"33","author":"Dai","year":"2017","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref4","doi-asserted-by":"crossref","first-page":"E3692","DOI":"10.1073\/pnas.1714376115","article-title":"Accurate and sensitive quantification of protein-DNA binding affinity","volume":"115","author":"Rastogi","year":"2018","journal-title":"Proc Natl Acad Sci"},{"key":"2022031506260450800_ref5","doi-asserted-by":"crossref","first-page":"2730","DOI":"10.1093\/bioinformatics\/bty1068","article-title":"Promoter analysis and prediction in the human genome using sequence-based deep learning models","volume":"35","author":"Umarov","year":"2019","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref6","doi-asserted-by":"crossref","first-page":"W365","DOI":"10.1093\/nar\/gkx407","article-title":"HDOCK: a web server for protein\u2013protein and protein\u2013DNA\/RNA docking based on a hybrid strategy","volume":"45","author":"Yan","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref7","doi-asserted-by":"crossref","first-page":"930","DOI":"10.1093\/bioinformatics\/bty756","article-title":"Improving the prediction of protein\u2013nucleic acids binding residues via multiple sequence profiles and the consensus of complementary methods","volume":"35","author":"Su","year":"2019","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref8","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1002\/prot.22154","article-title":"Improving accuracy and efficiency of blind protein-ligand docking by focusing on predicted binding sites","volume":"74","author":"Ghersi","year":"2009","journal-title":"Proteins"},{"key":"2022031506260450800_ref9","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1016\/j.ajhg.2015.05.021","article-title":"THOC2 mutations implicate mRNA-export pathway in X-linked intellectual disability","volume":"97","author":"Kumar","year":"2015","journal-title":"Am J Hum Genet"},{"key":"2022031506260450800_ref10","doi-asserted-by":"crossref","first-page":"4498","DOI":"10.1172\/JCI91553","article-title":"JAK2-binding long noncoding RNA promotes breast cancer brain metastasis","volume":"127","author":"Wang","year":"2017","journal-title":"J Clin Invest"},{"key":"2022031506260450800_ref11","doi-asserted-by":"crossref","first-page":"1058","DOI":"10.1016\/j.febslet.2007.01.086","article-title":"Residue-level prediction of DNA-binding sites and its application on DNA-binding protein predictions","volume":"581","author":"Bhardwaj","year":"2007","journal-title":"FEBS Lett"},{"key":"2022031506260450800_ref12","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1003341","article-title":"Structure-based function prediction of uncharacterized protein using binding sites comparison","volume":"9","author":"Konc","year":"2013","journal-title":"PLoS Comput Biol"},{"key":"2022031506260450800_ref13","doi-asserted-by":"crossref","first-page":"5858","DOI":"10.1021\/jm100574m","article-title":"Understanding and predicting druggability. A high-throughput method for detection of drug binding sites","volume":"53","author":"Schmidtke","year":"2010","journal-title":"J Med Chem"},{"key":"2022031506260450800_ref14","doi-asserted-by":"crossref","first-page":"3240","DOI":"10.1021\/acs.jcim.0c01494","article-title":"De novo molecule design through the molecular generative model conditioned by 3D information of protein binding sites","volume":"61","author":"Xu","year":"2021","journal-title":"J Chem Inf Model"},{"key":"2022031506260450800_ref15","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","article-title":"CATH\u2013a hierarchic classification of protein domain structures","volume":"5","author":"Orengo","year":"1997","journal-title":"Structure"},{"key":"2022031506260450800_ref16","doi-asserted-by":"crossref","first-page":"2306","DOI":"10.1093\/nar\/26.10.2306","article-title":"Quantitative parameters for amino acid-base interaction: implications for prediction of protein-DNA binding sites","volume":"26","author":"Mandel-Gutfreund","year":"1998","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2174\/0929867003375461","article-title":"Targeting DNA secondary structures","volume":"7","author":"Wadkins","year":"2000","journal-title":"Curr Med Chem"},{"key":"2022031506260450800_ref18","doi-asserted-by":"crossref","first-page":"17493","DOI":"10.3390\/ijms151017493","article-title":"DNA and RNA quadruplex-binding proteins","volume":"15","author":"Br\u00e1zda","year":"2014","journal-title":"Int J Mol Sci"},{"key":"2022031506260450800_ref19","doi-asserted-by":"crossref","first-page":"5922","DOI":"10.1093\/nar\/gkn573","article-title":"Protein\u2013DNA interactions: structural, thermodynamic and clustering patterns of conserved residues in DNA-binding proteins","volume":"36","author":"Ahmad","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref20","doi-asserted-by":"crossref","first-page":"3057","DOI":"10.1021\/acs.jcim.8b00749","article-title":"DNAPred: accurate identification of DNA-binding sites from protein sequence by ensembled hyperplane-distance-based support vector machines","volume":"59","author":"Zhu","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2022031506260450800_ref21","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab336","article-title":"DNAgenie: accurate prediction of DNA-type-specific binding residues in protein sequences","volume":"22","author":"Zhang","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022031506260450800_ref22","article-title":"NCBRPred: predicting nucleic acid binding residues in proteins based on multilabel learning","volume":"22","author":"Zhang","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022031506260450800_ref23","doi-asserted-by":"crossref","first-page":"7189","DOI":"10.1093\/nar\/gkg922","article-title":"Using electrostatic potentials to predict DNA-binding sites on DNA-binding proteins","volume":"31","author":"Jones","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref24","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1002\/prot.20111","article-title":"Structure-based prediction of DNA-binding sites on proteins using the empirical preference of electrostatic potential and the shape of molecular surfaces","volume":"55","author":"Tsuchiya","year":"2004","journal-title":"Proteins"},{"key":"2022031506260450800_ref25","doi-asserted-by":"crossref","first-page":"3036","DOI":"10.1093\/bioinformatics\/btx350","article-title":"DeepSite: protein-binding site predictor using 3D-convolutional neural networks","volume":"33","author":"Jim\u00e9nez","year":"2017","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref26","doi-asserted-by":"crossref","first-page":"e51","DOI":"10.1093\/nar\/gkab044","article-title":"GraphBind: protein structural context embedded rules learned by hierarchical graph neural networks for recognizing nucleic-acid-binding residues","volume":"49","author":"Xia","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref27","doi-asserted-by":"crossref","first-page":"1885","DOI":"10.1002\/prot.24330","article-title":"DNABind: a hybrid algorithm for structure-based prediction of DNA-binding residues by combining machine learning-and template-based approaches","volume":"81","author":"Liu","year":"2013","journal-title":"Proteins"},{"key":"2022031506260450800_ref28","doi-asserted-by":"crossref","first-page":"W438","DOI":"10.1093\/nar\/gky439","article-title":"COACH-D: improved protein\u2013ligand binding sites prediction with refined ligand-binding poses through molecular docking","volume":"46","author":"Wu","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref29","doi-asserted-by":"crossref","first-page":"7606","DOI":"10.1093\/nar\/gkt544","article-title":"Novel approach for selecting the best predictor for identifying the binding sites in DNA binding proteins","volume":"41","author":"Nagarajan","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref30","first-page":"1","article-title":"Highly accurate protein structure prediction with AlphaFold","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2022031506260450800_ref31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-12920-0","article-title":"A deep learning framework to predict binding preference of RNA constituents on protein surface","volume":"10","author":"Lam","year":"2019","journal-title":"Nat Commun"},{"key":"2022031506260450800_ref32","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1038\/s42256-020-0152-y","article-title":"Predicting drug\u2013protein interaction using quasi-visual question answering system","volume":"2","author":"Zheng","year":"2020","journal-title":"Nat Mach Intell"},{"key":"2022031506260450800_ref33","doi-asserted-by":"crossref","first-page":"3814","DOI":"10.1021\/acs.jcim.1c00475","article-title":"Protein\u2013peptide binding site detection using 3D convolutional neural networks","volume":"61","author":"Kozlovskii","year":"2021","journal-title":"J Chem Inf Model"},{"key":"2022031506260450800_ref34","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1093\/bioinformatics\/btab643","article-title":"Structure-aware protein\u2013protein interaction site prediction using deep graph convolutional network","volume":"38","author":"Yuan","year":"2022","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref35","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/s13321-021-00488-1","article-title":"Structure-aware protein solubility prediction from sequence through graph convolutional network and predicted contact map","volume":"13","author":"Chen","year":"2021","journal-title":"J Cheminfo"},{"key":"2022031506260450800_ref36","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani","year":"2017"},{"key":"2022031506260450800_ref37","first-page":"4171","article-title":"Bert: Pre-training of Deep Bidirectional Transformers for Language Understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics","author":"Devlin","year":"2019"},{"key":"2022031506260450800_ref38","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1021\/acs.jcim.9b00949","article-title":"Predicting retrosynthetic reactions using self-corrected transformer neural networks","volume":"60","author":"Zheng","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2022031506260450800_ref39","doi-asserted-by":"crossref","first-page":"4406","DOI":"10.1093\/bioinformatics\/btaa524","article-title":"TransformerCPI: improving compound\u2013protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments","volume":"36","author":"Chen","year":"2020","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref40","first-page":"15820","article-title":"Generative models for graph-based protein design","volume":"32","author":"Ingraham","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"key":"2022031506260450800_ref41","doi-asserted-by":"crossref","first-page":"2242","DOI":"10.24963\/ijcai.2021\/309","volume-title":"Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence","author":"Chen","year":"2021"},{"key":"2022031506260450800_ref42","article-title":"Do Transformers Really Perform Badly for Graph Representation?","author":"Ying","year":"2021","journal-title":"Thirty-Fifth Conference on Neural Information Processing Systems"},{"key":"2022031506260450800_ref43","doi-asserted-by":"crossref","first-page":"D1096","DOI":"10.1093\/nar\/gks966","article-title":"BioLiP: a semi-manually curated database for biologically relevant ligand\u2013protein interactions","volume":"41","author":"Yang","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref44","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref45","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref46","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference clusters","volume":"23","author":"Suzek","year":"2007","journal-title":"Bioinformatics"},{"key":"2022031506260450800_ref47","first-page":"D570","article-title":"MGnify: the microbiome analysis resource in 2020","volume":"48","author":"Mitchell","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref48","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1038\/s41592-019-0437-4","article-title":"Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold","volume":"16","author":"Steinegger","year":"2019","journal-title":"Nat Methods"},{"key":"2022031506260450800_ref49","doi-asserted-by":"crossref","first-page":"D170","DOI":"10.1093\/nar\/gkw1081","article-title":"Uniclust databases of clustered and deeply annotated protein sequences and alignments","volume":"45","author":"Mirdita","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref50","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-3019-7","article-title":"HH-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"Steinegger","year":"2019","journal-title":"BMC Bioinform"},{"key":"2022031506260450800_ref51","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref52","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat Methods"},{"key":"2022031506260450800_ref53","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2022031506260450800_ref54","volume-title":"3rd International Conference on Learning Representations (Poster)","author":"Kingma","year":"2015"},{"key":"2022031506260450800_ref55","first-page":"8026","article-title":"Pytorch: an imperative style, high-performance deep learning library","volume":"32","author":"Paszke","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"key":"2022031506260450800_ref56","article-title":"Using deep neural networks and biological subwords to detect protein S-sulfenylation sites","volume":"22","author":"Do","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022031506260450800_ref57","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab005","article-title":"A transformer architecture based on BERT and 2D convolutional neural network to identify DNA enhancers from sequence information","volume":"22","author":"Le","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022031506260450800_ref58","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0118432","article-title":"The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets","volume":"10","author":"Saito","year":"2015","journal-title":"PLoS One"},{"key":"2022031506260450800_ref59","first-page":"e84","article-title":"DRNApred, fast sequence-based method that accurately predicts and discriminates DNA-and RNA-binding residues","volume":"45","author":"Yan","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref60","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1214\/aoms\/1177729437","article-title":"Asymptotic theory of certain ``goodness of fit'' criteria based on stochastic processes","volume":"23","author":"Anderson","year":"1952","journal-title":"Ann Math Stat"},{"key":"2022031506260450800_ref61","doi-asserted-by":"crossref","first-page":"80","DOI":"10.2307\/3001968","article-title":"Individual comparisons by ranking methods","volume":"1","author":"Wilcoxon","year":"1945","journal-title":"Biometrics"},{"key":"2022031506260450800_ref62","doi-asserted-by":"crossref","first-page":"3370","DOI":"10.1093\/nar\/gkg571","article-title":"LGA: a method for finding 3D similarities in protein structures","volume":"31","author":"Zemla","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2022031506260450800_ref63","doi-asserted-by":"crossref","first-page":"2080","DOI":"10.1002\/prot.24100","article-title":"A new size-independent score for pairwise protein structure alignment and its application to structure classification and nucleic-acid binding prediction","volume":"80","author":"Yang","year":"2012","journal-title":"Proteins"},{"key":"2022031506260450800_ref64","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1214\/aoms\/1177730491","article-title":"On a test of whether one of two random variables is stochastically larger than the other","author":"Mann","year":"1947","journal-title":"Ann Math Stat"},{"key":"2022031506260450800_ref65","article-title":"Visualizing data using t-SNE","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2022031506260450800_ref66","doi-asserted-by":"crossref","DOI":"10.1109\/TCBB.2021.3118916","article-title":"To improve the predictions of binding residues with DNA, RNA, carbohydrate, and peptide via multi-task deep neural networks","author":"Sun","year":"2021","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbab564\/42805314\/bbab564.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbab564\/42805314\/bbab564.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,13]],"date-time":"2023-11-13T18:20:54Z","timestamp":1699899654000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbab564\/6509729"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,17]]},"references-count":66,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,3,10]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbab564","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.08.25.457661","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,3]]},"published":{"date-parts":[[2022,1,17]]},"article-number":"bbab564"}}