{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T10:28:24Z","timestamp":1769855304725,"version":"3.49.0"},"reference-count":54,"publisher":"Springer Science and Business Media LLC","issue":"S13","license":[{"start":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T00:00:00Z","timestamp":1598918400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,17]],"date-time":"2020-09-17T00:00:00Z","timestamp":1600300800000},"content-version":"vor","delay-in-days":16,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Protein-DNA interaction governs a large number of cellular processes, and it can be altered by a small fraction of interface residues, i.e., the so-called <jats:italic>hot spots<\/jats:italic>, which account for most of the interface binding free energy. Accurate prediction of hot spots is critical to understand the principle of protein-DNA interactions. There are already some computational methods that can accurately and efficiently predict a large number of hot residues. However, the insufficiency of experimentally validated hot-spot residues in protein-DNA complexes and the low diversity of the employed features limit the performance of existing methods.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Here, we report a new computational method for effectively predicting hot spots in protein-DNA binding interfaces. 
This method, called <jats:italic>PreHots<\/jats:italic> (the abbreviation of <jats:italic>Pre<\/jats:italic>dicting <jats:italic>Hot<\/jats:italic><jats:italic>s<\/jats:italic>pots), adopts an ensemble stacking classifier that integrates different machine learning classifiers to generate a robust model with 19 features selected by a sequential backward feature selection algorithm. To this end, we constructed two new and reliable datasets (one benchmark for model training and one independent dataset for validation), which together consist of 123 hot spots and 137 non-hot spots from 89 protein-DNA complexes. The data were manually collected from the literature and existing databases with a strict process of redundancy removal. Our method achieves a sensitivity of 0.813 and an AUC score of 0.868 in 10-fold cross-validation on the benchmark dataset, and a sensitivity of 0.818 and an AUC score of 0.820 on the independent test dataset. The results show that our approach outperforms the existing ones.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p><jats:italic>PreHots<\/jats:italic>, which is based on a stacked ensemble of boosting algorithms, can reliably predict hot spots at the protein-DNA binding interface on a large scale. Compared with the existing methods, <jats:italic>PreHots<\/jats:italic> can achieve better prediction performance. 
Both the webserver of <jats:italic>PreHots<\/jats:italic> and the datasets are freely available at: <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"http:\/\/dmb.tongji.edu.cn\/tools\/PreHots\/\">http:\/\/dmb.tongji.edu.cn\/tools\/PreHots\/<\/jats:ext-link>.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-020-03675-3","type":"journal-article","created":{"date-parts":[[2020,9,17]],"date-time":"2020-09-17T00:03:04Z","timestamp":1600300984000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Computationally identifying hot spots in protein-DNA binding interfaces using an ensemble approach"],"prefix":"10.1186","volume":"21","author":[{"given":"Yuliang","family":"Pan","sequence":"first","affiliation":[]},{"given":"Shuigeng","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Jihong","family":"Guan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,17]]},"reference":[{"key":"3675_CR1","doi-asserted-by":"crossref","unstructured":"Berman MH. The protein data bank. Nucleic Acids Res; 28(1):235\u201342.","DOI":"10.1093\/nar\/28.1.235"},{"issue":"4","key":"3675_CR2","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1016\/S0092-8674(02)00655-4","volume":"108","author":"G Orphanides","year":"2002","unstructured":"Orphanides G, Reinberg D. A unified theory of gene expression. Cell. 2002; 108(4):439\u201351.","journal-title":"Cell"},{"key":"3675_CR3","doi-asserted-by":"crossref","unstructured":"Roeder R. Role of general and gene-specific cofactors in the regulation of eukaryotic transcription. In: Cold Spring Harbor Symposia on Quantitative Biology, vol. 63. Cold Spring Harbor Symposia on Quantitative Biology: 1998. p. 
201\u201318.","DOI":"10.1101\/sqb.1998.63.201"},{"issue":"9","key":"3675_CR4","doi-asserted-by":"crossref","first-page":"1473","DOI":"10.1093\/bioinformatics\/btx822","volume":"34","author":"Y Pan","year":"2017","unstructured":"Pan Y, Wang Z, Zhan W, Deng L. Computational identification of binding energy hot spots in protein-RNA complexes using an ensemble approach. Bioinformatics. 2017; 34(9):1473\u201380.","journal-title":"Bioinformatics"},{"issue":"8","key":"3675_CR5","doi-asserted-by":"crossref","first-page":"2127","DOI":"10.1021\/bi061903t","volume":"46","author":"HF Teh","year":"2007","unstructured":"Teh HF, Peh WY, Su X, Thomsen JS. Characterization of protein-DNA interactions using surface plasmon resonance spectroscopy with various assay schemes. Biochemistry. 2007; 46(8):2127\u201335.","journal-title":"Biochemistry"},{"issue":"18","key":"3675_CR6","doi-asserted-by":"crossref","first-page":"950","DOI":"10.1021\/ac00217a002","volume":"62","author":"E Freire","year":"1990","unstructured":"Freire E, Mayorga OL, Straume M. Isothermal titration calorimetry. Anal Chem. 1990; 62(18):950\u20139.","journal-title":"Anal Chem"},{"issue":"2","key":"3675_CR7","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1016\/S0959-440X(00)00190-1","volume":"11","author":"A Hillisch","year":"2001","unstructured":"Hillisch A, Lorenz M, Diekmann S. Recent advances in fret: distance determination in protein-DNA complexes. Curr Opin Struct Biol. 2001; 11(2):201\u20137.","journal-title":"Curr Opin Struct Biol"},{"issue":"5","key":"3675_CR8","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1093\/bioinformatics\/btx698","volume":"34","author":"Y Peng","year":"2017","unstructured":"Peng Y, Sun L, Jia Z, Li L, Alexov E. Predicting protein-DNA binding free energy change upon missense mutations using modified MM\/PBSA approach: SAMPDI webserver. Bioinformatics. 
2017; 34(5):779\u201386.","journal-title":"Bioinformatics"},{"issue":"12","key":"3675_CR9","doi-asserted-by":"crossref","first-page":"1006615","DOI":"10.1371\/journal.pcbi.1006615","volume":"14","author":"N Zhang","year":"2018","unstructured":"Zhang N, Chen Y, Zhao F, Yang Q, Simonetti FL, Li M. PremPDI estimates and interprets the effects of missense mutations on protein-DNA interactions. PLoS Comput Biol. 2018; 14(12):1006615.","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"3675_CR10","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1021\/ci100275a","volume":"51","author":"T Hou","year":"2010","unstructured":"Hou T, Wang J, Li Y, Wang W. Assessing the performance of the MM\/PBSA and MM\/GBSA methods. 1. the accuracy of binding free energy calculations based on molecular dynamics simulations. J Chem Inf Model. 2010; 51(1):69\u201382.","journal-title":"J Chem Inf Model"},{"issue":"W1","key":"3675_CR11","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1093\/nar\/gkx236","volume":"45","author":"DE Pires","year":"2017","unstructured":"Pires DE, Ascher DB. mCSM-NA: predicting the effects of mutations on protein\u2013nucleic acids interactions. Nucleic Acids Res. 2017; 45(W1):241\u20136.","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"3675_CR12","doi-asserted-by":"crossref","first-page":"1038","DOI":"10.1093\/bib\/bbz037","volume":"21","author":"S Zhang","year":"2019","unstructured":"Zhang S, Zhao L, Zheng C-H, Xia J. A feature-based approach to predict hot spots in protein-DNA binding interfaces. Brief Bioinform. 2019; 21(3):1038\u201346.","journal-title":"Brief Bioinform"},{"issue":"suppl_1","key":"3675_CR13","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1093\/nar\/gkj103","volume":"34","author":"MS Kumar","year":"2006","unstructured":"Kumar MS, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, Uedaira H, Sarai A. Protherm and pronit: thermodynamic databases for proteins and protein\u2013nucleic acid interactions. 
Nucleic Acids Res. 2006; 34(suppl_1):204\u20136.","journal-title":"Nucleic Acids Res"},{"key":"3675_CR14","doi-asserted-by":"publisher","unstructured":"Liu L, Xiong Y, Gao H, Wei D-Q, Mitchell JC, Zhu X. dbAMEPNI: a database of alanine mutagenic effects for protein\u2013nucleic acid interactions. Database. 2018; 2018. https:\/\/doi.org\/10.1093\/database\/bay034.","DOI":"10.1093\/database\/bay034"},{"key":"3675_CR15","unstructured":"Dorogush AV, Ershov V, Gulin A. Catboost: gradient boosting with categorical features support. 2018. arXiv preprint arXiv:1810.11363."},{"key":"3675_CR16","doi-asserted-by":"crossref","unstructured":"Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In: In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining. ACM: 2016. p. 785\u201394.","DOI":"10.1145\/2939672.2939785"},{"issue":"4","key":"3675_CR17","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","volume":"38","author":"JH Friedman","year":"2002","unstructured":"Friedman JH. Stochastic gradient boosting. Comput Stat Data Anal. 2002; 38(4):367\u201378.","journal-title":"Comput Stat Data Anal"},{"issue":"3","key":"3675_CR18","first-page":"497","volume":"68","author":"RE Wright","year":"1995","unstructured":"Wright RE. Logistic regression. Reading & Understanding Multivariate Stats. 1995; 68(3):497\u201307.","journal-title":"Reading & Understanding Multivariate Stats"},{"key":"3675_CR19","unstructured":"Hubbard SJ, Thornton JM. Naccess. Computer Program, Department of Biochemistry and Molecular Biology, University College London. 1993; 2(1)."},{"issue":"13","key":"3675_CR20","doi-asserted-by":"crossref","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","volume":"22","author":"W Li","year":"2006","unstructured":"Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 
2006; 22(13):1658\u20139.","journal-title":"Bioinformatics"},{"issue":"6","key":"3675_CR21","doi-asserted-by":"crossref","first-page":"1419","DOI":"10.1007\/s00726-014-1710-6","volume":"46","author":"W Yan","year":"2014","unstructured":"Yan W, Zhou J, Sun M, Chen J, Hu G, Shen B. The construction of an amino acid network for understanding protein structure and function. Amino Acids. 2014; 46(6):1419\u201339.","journal-title":"Amino Acids"},{"issue":"W1","key":"3675_CR22","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1093\/nar\/gkw383","volume":"44","author":"B Chakrabarty","year":"2016","unstructured":"Chakrabarty B, Parekh N. NAPS: Network analysis of protein structures. Nucleic Acids Res. 2016; 44(W1):375\u201382.","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"3675_CR23","first-page":"0179314","volume":"12","author":"Y Pan","year":"2017","unstructured":"Pan Y, Liu D, Deng L. Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties. PLoS ONE. 2017; 12(6):0179314.","journal-title":"PLoS ONE"},{"issue":"6136","key":"3675_CR24","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1038\/329263a0","volume":"329","author":"M Hogan","year":"1987","unstructured":"Hogan M, Austin RH. Importance of DNA stiffness in protein-DNA binding specificity. Nature. 1987; 329(6136):263.","journal-title":"Nature"},{"issue":"13","key":"3675_CR25","doi-asserted-by":"crossref","first-page":"2860","DOI":"10.1093\/nar\/29.13.2860","volume":"29","author":"NM Luscombe","year":"2001","unstructured":"Luscombe NM, Laskowski RA, Thornton JM. Amino acid\u2013base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level. Nucleic Acids Res. 
2001; 29(13):2860\u201374.","journal-title":"Nucleic Acids Res"},{"issue":"7268","key":"3675_CR26","doi-asserted-by":"crossref","first-page":"1248","DOI":"10.1038\/nature08473","volume":"461","author":"R Rohs","year":"2009","unstructured":"Rohs R, West SM, Sosinsky A, Liu P, Mann RS, Honig B. The role of dna shape in protein-DNA recognition. Nature. 2009; 461(7268):1248.","journal-title":"Nature"},{"issue":"W1","key":"3675_CR27","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1093\/nar\/gkw315","volume":"44","author":"D Piovesan","year":"2016","unstructured":"Piovesan D, Minervini G, Tosatto SC. The RING 2.0 web server for high quality residue interaction networks. Nucleic Acids Res. 2016; 44(W1):367\u201374.","journal-title":"Nucleic Acids Res"},{"issue":"11","key":"3675_CR28","doi-asserted-by":"crossref","first-page":"878","DOI":"10.1089\/cmb.2013.0083","volume":"20","author":"L Deng","year":"2013","unstructured":"Deng L, Guan J, Wei X, Yi Y, Zhang QC, Zhou S. Boosting prediction performance of protein-protein interaction hot spots by using structural neighborhood properties. J Comput Biol. 2013; 20(11):878\u201391.","journal-title":"J Comput Biol"},{"issue":"Webserver-Issue","key":"3675_CR29","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1093\/nar\/gku437","volume":"42","author":"L Deng","year":"2014","unstructured":"Deng L, Zhang QC, Chen Z, Meng Y, Guan J, Zhou S. PredHS: a web server for predicting protein\u2013protein interaction hot spots by using structural neighborhood properties. Nucleic Acids Res. 2014; 42(Webserver-Issue):290\u20135.","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"3675_CR30","doi-asserted-by":"crossref","first-page":"1513","DOI":"10.1093\/bioinformatics\/btp240","volume":"25","author":"N Tuncbag","year":"2009","unstructured":"Tuncbag N, Gursoy A, Keskin O. Identification of computational hot spots in protein interfaces: combining solvent accessibility and inter-residue potentials improves the accuracy. 
Bioinformatics. 2009; 25(12):1513\u201320.","journal-title":"Bioinformatics"},{"issue":"1","key":"3675_CR31","doi-asserted-by":"crossref","first-page":"426","DOI":"10.1186\/1471-2105-10-426","volume":"10","author":"L Deng","year":"2009","unstructured":"Deng L, Guan J, Dong Q, Zhou S. Prediction of protein-protein interaction sites using an ensemble method. BMC Bioinformatics. 2009; 10(1):426.","journal-title":"BMC Bioinformatics"},{"issue":"13","key":"3675_CR32","doi-asserted-by":"crossref","first-page":"1489","DOI":"10.1093\/bioinformatics\/btn222","volume":"24","author":"J Song","year":"2008","unstructured":"Song J, Tan H, Takemoto K, Akutsu T. HSEpred: predict half-sphere exposure from protein sequences. Bioinformatics. 2008; 24(13):1489\u201397.","journal-title":"Bioinformatics"},{"issue":"1","key":"3675_CR33","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1002\/prot.20379","volume":"59","author":"T Hamelryck","year":"2005","unstructured":"Hamelryck T. An amino acid has two sides: a new 2D measure provides a different view of solvent exposure. Proteins Struct Funct Bioinforma. 2005; 59(1):38\u201348.","journal-title":"Proteins Struct Funct Bioinforma"},{"key":"3675_CR34","first-page":"2403","volume":"10","author":"J Hanson","year":"2018","unstructured":"Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y. Improving prediction of protein secondary structure, backbone angles, solvent accessibility and contact numbers by using predicted contact maps and an ensemble of recurrent and residual convolutional neural networks. Bioinformatics. 2018; 10:2403\u201310.","journal-title":"Bioinformatics"},{"issue":"5","key":"3675_CR35","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1006\/jmbi.1994.1334","volume":"238","author":"IK McDonald","year":"1994","unstructured":"McDonald IK, Thornton JM. Satisfying hydrogen bonding potential in proteins. J Mol Biol. 
1994; 238(5):777\u201393.","journal-title":"J Mol Biol"},{"issue":"2","key":"3675_CR36","doi-asserted-by":"crossref","first-page":"397","DOI":"10.1002\/prot.22252","volume":"75","author":"S Liang","year":"2009","unstructured":"Liang S, Meroueh SO, Wang G, Qiu C, Zhou Y. Consensus scoring for enriching near-native structures from protein\u2013protein docking decoys. Proteins Struct Funct Bioinforma. 2009; 75(2):397\u2013403.","journal-title":"Proteins Struct Funct Bioinforma"},{"issue":"12","key":"3675_CR37","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","volume":"22","author":"W Kabsch","year":"1983","unstructured":"Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers Orig Res Biomol. 1983; 22(12):2577\u2013637.","journal-title":"Biopolymers Orig Res Biomol"},{"issue":"6","key":"3675_CR38","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1002\/prot.25674","volume":"87","author":"MS Klausen","year":"2019","unstructured":"Klausen MS, Jespersen MC, Nielsen H, Jensen KK, Jurtz VI, Soenderby CK, Sommer MOA, Winther O, Nielsen M, Petersen B, et al. Netsurfp-2.0: Improved prediction of protein structural features by integrated deep learning. Proteins Struct Funct Bioinforma. 2019; 87(6):520\u20137.","journal-title":"Proteins Struct Funct Bioinforma"},{"issue":"18","key":"3675_CR39","doi-asserted-by":"crossref","first-page":"2842","DOI":"10.1093\/bioinformatics\/btx218","volume":"33","author":"R Heffernan","year":"2017","unstructured":"Heffernan R, Yang Y, Paliwal K, Zhou Y. Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility. Bioinformatics. 
2017; 33(18):2842\u20139.","journal-title":"Bioinformatics"},{"issue":"5","key":"3675_CR40","doi-asserted-by":"crossref","first-page":"1425","DOI":"10.1002\/prot.24040","volume":"80","author":"M Jamroz","year":"2012","unstructured":"Jamroz M, Kolinski A, Kihara D. Structural features that predict real-value fluctuations of globular proteins. Proteins Struct Funct Bioinforma. 2012; 80(5):1425\u201335.","journal-title":"Proteins Struct Funct Bioinforma"},{"issue":"15","key":"3675_CR41","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1093\/bioinformatics\/btm270","volume":"23","author":"JA Capra","year":"2007","unstructured":"Capra JA, Singh M. Predicting functionally important residues from sequence conservation. Bioinformatics. 2007; 23(15):1875\u201382.","journal-title":"Bioinformatics"},{"issue":"1","key":"3675_CR42","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1093\/nar\/28.1.374","volume":"28","author":"S Kawashima","year":"2000","unstructured":"Kawashima S, Kanehisa M. AAindex: amino acid index database. Nucleic Acids Res. 2000; 28(1):374.","journal-title":"Nucleic Acids Res"},{"issue":"22","key":"3675_CR43","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","volume":"89","author":"S Henikoff","year":"1992","unstructured":"Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci. 1992; 89(22):10915\u20139.","journal-title":"Proc Natl Acad Sci"},{"issue":"4","key":"3675_CR44","doi-asserted-by":"crossref","first-page":"684","DOI":"10.1002\/prot.20263","volume":"57","author":"C-H Chan","year":"2004","unstructured":"Chan C-H, Liang H-K, Hsiao N-W, Ko M-T, Lyu P-C, Hwang J-K. Relationship between local structural entropy and protein thermostabilty. Proteins Struct Funct Bioinforma. 
2004; 57(4):684\u201391.","journal-title":"Proteins Struct Funct Bioinforma"},{"issue":"5","key":"3675_CR45","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1093\/bioinformatics\/btw678","volume":"33","author":"J Hanson","year":"2016","unstructured":"Hanson J, Yang Y, Paliwal K, Zhou Y. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics. 2016; 33(5):685\u2013692.","journal-title":"Bioinformatics"},{"issue":"W1","key":"3675_CR46","doi-asserted-by":"crossref","first-page":"430","DOI":"10.1093\/nar\/gkw306","volume":"44","author":"S Wang","year":"2016","unstructured":"Wang S, Li W, Liu S, Xu J. Raptorx-property: a web server for protein structure property prediction. Nucleic Acids Res. 2016; 44(W1):430\u20135.","journal-title":"Nucleic Acids Res"},{"key":"3675_CR47","doi-asserted-by":"crossref","unstructured":"Van der Laan MJ, Polley EC, Hubbard AE. Super learner. Stat Appl Genet Mol Biol. 2007; 6(1).","DOI":"10.2202\/1544-6115.1309"},{"issue":"1","key":"3675_CR48","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L. Random forests. Mach Learn. 2001; 45(1):5\u201332.","journal-title":"Mach Learn"},{"issue":"3","key":"3675_CR49","first-page":"27","volume":"2","author":"C-C Chang","year":"2011","unstructured":"Chang C-C, Lin C-J. Libsvm: A library for support vector machines. ACM Trans Intell Syst Technol (TIST). 2011; 2(3):27.","journal-title":"ACM Trans Intell Syst Technol (TIST)"},{"issue":"1-3","key":"3675_CR50","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1023\/A:1012487302797","volume":"46","author":"I Guyon","year":"2002","unstructured":"Guyon I, Weston J, Barnhill S, Vapnik V. Gene selection for cancer classification using support vector machines. Mach Learn. 
2002; 46(1-3):389\u2013422.","journal-title":"Mach Learn"},{"key":"3675_CR51","doi-asserted-by":"publisher","unstructured":"Peng H, Long F, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell. 2005; 8:1226\u201338. https:\/\/doi.org\/10.1109\/tpami.2005.159.","DOI":"10.1109\/tpami.2005.159"},{"key":"3675_CR52","doi-asserted-by":"publisher","unstructured":"Climente-Gonz\u00e1lez H, Azencott C-A, Kaski S, Yamada M. Block hsic lasso: model-free biomarker detection for ultra-high dimensional data. bioRxiv. 2019:532192. https:\/\/doi.org\/10.1093\/bioinformatics\/btz333.","DOI":"10.1093\/bioinformatics\/btz333"},{"issue":"39","key":"3675_CR53","doi-asserted-by":"crossref","first-page":"6139","DOI":"10.1021\/acs.biochem.5b00707","volume":"54","author":"X Pan","year":"2015","unstructured":"Pan X, Smith CE, Zhang J, McCabe KA, Fu J, Bell CE. A structure\u2013activity analysis for probing the mechanism of processive double-stranded DNA digestion by \u03bb exonuclease trimers. Biochemistry. 2015; 54(39):6139\u201348.","journal-title":"Biochemistry"},{"key":"3675_CR54","doi-asserted-by":"crossref","first-page":"4595","DOI":"10.1038\/ncomms5595","volume":"5","author":"S Amrane","year":"2014","unstructured":"Amrane S, Rebora K, Zniber I, Dupuy D, Mackereth CD. Backbone-independent nucleic acid binding by splicing factor sup-12 reveals key aspects of molecular recognition. Nat Commun. 
2014; 5:4595.","journal-title":"Nat Commun"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03675-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03675-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03675-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,7]],"date-time":"2023-10-07T11:57:17Z","timestamp":1696679837000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03675-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9]]},"references-count":54,"journal-issue":{"issue":"S13","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["3675"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03675-3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9]]},"assertion":[{"value":"17 September 2020","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"384"}}