{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,30]],"date-time":"2026-03-30T02:27:07Z","timestamp":1774837627318,"version":"3.50.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2012,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues <jats:sc>cys<\/jats:sc> and <jats:sc>leu<\/jats:sc>. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein\/ligand interactions. Several of these rules are consistent with descriptions in the literature.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>In addition to confirming literature results, ProGolem\u2019s model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein\/hexose interactions and is comparable with state-of-the-art statistical learners.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-13-162","type":"journal-article","created":{"date-parts":[[2012,7,11]],"date-time":"2012-07-11T14:19:36Z","timestamp":1342016376000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Automated identification of protein-ligand interaction features using Inductive Logic Programming: a hexose binding case study"],"prefix":"10.1186","volume":"13","author":[{"given":"Jose C","family":"A Santos","sequence":"first","affiliation":[]},{"given":"Houssam","family":"Nassif","sequence":"additional","affiliation":[]},{"given":"David","family":"Page","sequence":"additional","affiliation":[]},{"given":"Stephen H","family":"Muggleton","sequence":"additional","affiliation":[]},{"given":"Michael J","family":"E Sternberg","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2012,7,11]]},"reference":[{"key":"5375_CR1","volume-title":"Biology","author":"E Solomon","year":"2007","unstructured":"Solomon E, Berg L, Martin DW: Biology. 2007, Brooks Cole, Belmont, CA"},{"issue":"7","key":"5375_CR2","doi-asserted-by":"publisher","first-page":"467","DOI":"10.1093\/protein\/gzg065","volume":"16","author":"C Shionyu-Mitsuyama","year":"2003","unstructured":"Shionyu-Mitsuyama C, Shirai T, Ishida H, Yamane T: An empirical approach for structure-based prediction of carbohydrate-binding sites on proteins. Protein Eng. 2003, 16 (7): 467-478. 10.1093\/protein\/gzg065.","journal-title":"Protein Eng"},{"issue":"9","key":"5375_CR3","doi-asserted-by":"publisher","first-page":"2502","DOI":"10.1110\/ps.04812804","volume":"13","author":"MS Sujatha","year":"2004","unstructured":"Sujatha MS, Sasidhar YU, Balaji PV: Energetics of galactose- and glucose-aromatic amino acid interactions: implications for binding in galactose-specific proteins. Protein Sci. 2004, 13 (9): 2502-2514. 10.1110\/ps.04812804.","journal-title":"Protein Sci"},{"issue":"29","key":"5375_CR4","doi-asserted-by":"publisher","first-page":"10153","DOI":"10.1073\/pnas.0504023102","volume":"102","author":"R Chakrabarti","year":"2005","unstructured":"Chakrabarti R, Klibanov AM, Friesner RA: Computational prediction of native protein ligand-binding and enzyme active site sequences. Proc Nat Acad Sci USA. 2005, 102 (29): 10153-10158. 10.1073\/pnas.0504023102.","journal-title":"Proc Nat Acad Sci USA"},{"key":"5375_CR5","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1186\/1472-6807-10-23","volume":"10","author":"AC Doxey","year":"2010","unstructured":"Doxey AC, Cheng Z, Moffatt BA, McConkey BJ: Structural motif screening reveals a novel, conserved carbohydrate-binding surface in the pathogenesis-related protein PR-5d. BMC Struct Biol. 2010, 10: 23-10.1186\/1472-6807-10-23.","journal-title":"BMC Struct Biol"},{"issue":"5","key":"5375_CR6","doi-asserted-by":"publisher","first-page":"1112","DOI":"10.1016\/j.jmb.2005.11.044","volume":"355","author":"ND Gold","year":"2006","unstructured":"Gold ND, Jackson RM: Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships. J Mol Biol. 2006, 355 (5): 1112-1124. 10.1016\/j.jmb.2005.11.044.","journal-title":"J Mol Biol"},{"key":"5375_CR7","doi-asserted-by":"publisher","first-page":"W595","DOI":"10.1093\/nar\/gkq398","volume":"38","author":"G Cipriano","year":"2010","unstructured":"Cipriano G, Wesenberg G, Grim T, Jr GNP, Gleicher M: GRAPE: GRaphical Abstracted Protein Explorer. Nucleic Acids Res. 2010, 38: W595-W601. 10.1093\/nar\/gkq398.","journal-title":"Nucleic Acids Res"},{"key":"5375_CR8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1472-6807-7-1","volume":"7","author":"A Malik","year":"2007","unstructured":"Malik A, Ahmad S: Sequence and structural features of carbohydrate binding in proteins and assessment of predictability using a neural network. BMC Struct Biol. 2007, 7: 1-10.1186\/1472-6807-7-1.","journal-title":"BMC Struct Biol"},{"key":"5375_CR9","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1002\/prot.22424","volume":"77","author":"H Nassif","year":"2009","unstructured":"Nassif H, Al-Ali H, Khuri S, Keirouz W: Prediction of protein-glucose binding sites using support vector machines. Proteins. 2009, 77: 121-132. 10.1002\/prot.22424.","journal-title":"Proteins"},{"issue":"5","key":"5375_CR10","doi-asserted-by":"publisher","first-page":"1195","DOI":"10.1002\/prot.22639","volume":"78","author":"T Kawabata","year":"2010","unstructured":"Kawabata T: Detection of multi-scale pockets on protein surfaces using mathematical morphology. Proteins. 2010, 78 (5): 1195-1211. 10.1002\/prot.22639.","journal-title":"Proteins"},{"key":"5375_CR11","first-page":"1","volume-title":"Proceedings of the IEEE Congress on Evolutionary Computation","author":"GY Wong","year":"2010","unstructured":"Wong GY, Leung FH: Predicting Protein-Ligand Binding Site with Support Vector Machine. Proceedings of the IEEE Congress on Evolutionary Computation. 2010, , Barcelona, Spain, 1-5."},{"key":"5375_CR12","first-page":"149","volume-title":"Proceedings of the 19th International Conference on ILP","author":"H Nassif","year":"2009","unstructured":"Nassif H, Al-Ali H, Khuri S, Keirouz W, Page D: An inductive logic programming approach to validate hexose biochemical knowledge. Proceedings of the 19th International Conference on ILP. 2009, , Leuven, Belgium, 149-165."},{"key":"5375_CR13","unstructured":"Srinivasan A: The Aleph Manual. 4th 2007. [http:\/\/www.comlab.ox.ac.uk\/activities\/machinelearning\/Aleph\/aleph.html]."},{"key":"5375_CR14","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1007\/3540635149_56","volume-title":"Proceedings of the 7th International Workshop on Inductive Logic Programming","author":"A Srinivasan","year":"1997","unstructured":"Srinivasan A, King RD, Muggleton SH, Sternberg MJE: Carcinogenesis predictions using ILP. Proceedings of the 7th International Workshop on Inductive Logic Programming. 1997, , Prague, Czech Republic, 273-287."},{"key":"5375_CR15","first-page":"349","volume-title":"American Medical Informatics Association (AMIA\u201911) Symposium Proceedings","author":"I Dutra","year":"2011","unstructured":"Dutra I, Nassif H, Page D, Shavlik J, Strigel R, Wu Y, EM E, Burnside E: Integrating machine learning and physician knowledge to improve the accuracy of breast biopsy. American Medical Informatics Association (AMIA\u201911) Symposium Proceedings. 2011, , Washington, DC, 349-355."},{"issue":"2\u20133","key":"5375_CR16","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1023\/A:1007460424845","volume":"30","author":"P Finn","year":"1998","unstructured":"Finn P, Muggleton S, Page D, Srinivasan A: Pharmacophore discovery using the inductive logic programming system PROGOL. Machine Learning. 1998, 30 (2\u20133): 241-270.","journal-title":"Machine Learning"},{"key":"5375_CR17","first-page":"273","volume-title":"Proceedings of the 4th International Workshop on Machine Learning in Systems Biology","author":"A Szaboova","year":"2010","unstructured":"Szaboova A, Kuzelka O, Zelezny F, Tolar J: Prediction of DNA-binding proteins from structural features. Proceedings of the 4th International Workshop on Machine Learning in Systems Biology. 2010, , Edinburgh, 273-287."},{"key":"5375_CR18","first-page":"131","volume-title":"Proceedings of the 19th International Conference on ILP","author":"S Muggleton","year":"2009","unstructured":"Muggleton S, Santos J, Tamaddoni-Nezhad A: ProGolem: a system based on relative minimal generalisation. Proceedings of the 19th International Conference on ILP. 2009, Springer, Leuven, Belgium, 131-148."},{"key":"5375_CR19","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","volume":"28","author":"HM Berman","year":"2000","unstructured":"Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The protein data bank. Nucleic Acids Res. 2000, 28: 235-242. 10.1093\/nar\/28.1.235.","journal-title":"Nucleic Acids Res"},{"key":"5375_CR20","volume-title":"Organic Chemistry","author":"MA Fox","year":"2004","unstructured":"Fox MA, Whitesell JK: Organic Chemistry. 2004, Jones & Bartlett Publishers, Boston, MA"},{"issue":"12","key":"5375_CR21","doi-asserted-by":"publisher","first-page":"1589","DOI":"10.1093\/bioinformatics\/btg224","volume":"19","author":"G Wang","year":"2003","unstructured":"Wang G, Dunbrack RL: PISCES: a protein sequence culling server. Bioinformatics. 2003, 19 (12): 1589-1591. 10.1093\/bioinformatics\/btg224.","journal-title":"Bioinformatics"},{"issue":"15","key":"5375_CR22","doi-asserted-by":"publisher","first-page":"5648","DOI":"10.1073\/pnas.87.15.5648","volume":"87","author":"MM Yamashita","year":"1990","unstructured":"Yamashita MM, Wesson L, Eisenman G, Eisenberg D: Where metal ions bind in proteins. Proceedings of the National Academy of Sciences USA. 1990, 87 (15): 5648-5652. 10.1073\/pnas.87.15.5648.","journal-title":"Proceedings of the National Academy of Sciences USA"},{"key":"5375_CR23","volume-title":"Machine Learning","author":"TM Mitchell","year":"1997","unstructured":"Mitchell TM: Machine Learning. 1997, McGraw-Hill International Editions, Singapore"},{"issue":"3\/4","key":"5375_CR24","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1016\/0004-3702(71)90012-9","volume":"2","author":"RA Kowalski","year":"1971","unstructured":"Kowalski RA, Kuehner D: Linear resolution with selection function. Artif Intelligence. 1971, 2 (3\/4): 227-260.","journal-title":"Artif Intelligence"},{"key":"5375_CR25","first-page":"172","volume-title":"Technical communications of the International Conference on Logic Programming","author":"J Santos","year":"2010","unstructured":"Santos J, Muggleton S: Subsumer: A Prolog theta-subsumption engine. Technical communications of the International Conference on Logic Programming. 2010, Edinburgh, Scotland, , 172-181."},{"key":"5375_CR26","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1002\/prot.10612","volume":"55","author":"MS Sujatha","year":"2004","unstructured":"Sujatha MS, Balaji PV: Identification of common structural features of binding sites in galactose-specific proteins. Proteins. 2004, 55: 44-65. 10.1002\/prot.10612.","journal-title":"Proteins"},{"key":"5375_CR27","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L: Random forests. Machine Learning. 2001, 45: 5-32. 10.1023\/A:1010933404324.","journal-title":"Machine Learning"},{"key":"5375_CR28","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1186\/1471-2105-7-3","volume":"7","author":"R D\u00edaz-Uriarte","year":"2006","unstructured":"D\u00edaz-Uriarte R, de Andr\u00e9s: Gene selection and classification of microarray data using random forest. BMC Bioinf. 2006, 7: 3-10.1186\/1471-2105-7-3.","journal-title":"BMC Bioinf"},{"key":"5375_CR29","first-page":"1","volume-title":"Proceedings of the 24th International Conference on Logic Programming","author":"V Santos Costa","year":"2008","unstructured":"Santos Costa V: The life of a logic programming system. Proceedings of the 24th International Conference on Logic Programming. Edited by: de la Banda MG, Pontelli E. 2008, Springer-Verlag, Udine, Italy, 1-6."},{"key":"5375_CR30","doi-asserted-by":"publisher","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","volume":"5","author":"C Orengo","year":"1997","unstructured":"Orengo C, Michie A, Jones S, Jones D, Swindells M: CATH\u2014a hierarchic classification of protein domain structures. Structure. 1997, 5: 1093-1108. 10.1016\/S0969-2126(97)00260-8.","journal-title":"Structure"},{"key":"5375_CR31","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1109\/34.824819","volume":"22","author":"AK Jain","year":"2000","unstructured":"Jain AK, Duin RPW, Mao J: Statistical pattern recognition: a review. IEEE Trans Pattern Analysis Machine Intelligence. 2000, 22: 4-37. 10.1109\/34.824819.","journal-title":"IEEE Trans Pattern Analysis Machine Intelligence"},{"key":"5375_CR32","volume-title":"Statistical Learning Theory","author":"VN Vapnik","year":"1998","unstructured":"Vapnik VN: Statistical Learning Theory. 1998, John Wiley & Sons, New York"},{"key":"5375_CR33","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1023\/A:1008280620621","volume":"7","author":"I Kononenko","year":"1997","unstructured":"Kononenko I, Simec E, Robnik-Sikonja M: Overcoming the myopia of inductive learning algorithms with RELIEFF. Appl Intell. 1997, 7: 39-55. 10.1023\/A:1008280620621.","journal-title":"Appl Intell"},{"issue":"4","key":"5375_CR34","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1016\/S0141-8130(98)00056-7","volume":"23","author":"VSR Rao","year":"1998","unstructured":"Rao VSR, Lam K, Qasba PK: Architecture of the sugar binding sites in carbohydrate binding proteins\u2014a computer modeling study. Int J Biol Macromol. 1998, 23 (4): 295-307. 10.1016\/S0141-8130(98)00056-7.","journal-title":"Int J Biol Macromol"},{"issue":"46","key":"5375_CR35","doi-asserted-by":"publisher","first-page":"13512","DOI":"10.1021\/bi035430r","volume":"42","author":"Y Zhang","year":"2003","unstructured":"Zhang Y, Swaminathan GJ, Deshpande A, Boix E, Natesh R, Xie Z, Acharya KR, Brew K: Roles of individual enzyme-substrate interactions by alpha-1,3-galactosyltransferase in catalysis and specificity. Biochemistry. 2003, 42 (46): 13512-13521. 10.1021\/bi035430r.","journal-title":"Biochemistry"},{"key":"5375_CR36","first-page":"441","volume-title":"Bioorganic Chemistry: Carbohydrates","author":"FA Quiocho","year":"1999","unstructured":"Quiocho FA, Vyas NK: Atomic interactions between proteins\/enzymes and carbohydrates. Bioorganic Chemistry: Carbohydrates. Edited by: Hecht SM. 1999, Oxford University Press, New York, 441-457."},{"key":"5375_CR37","doi-asserted-by":"publisher","first-page":"3644","DOI":"10.1002\/anie.200605116","volume":"46","author":"J Screen","year":"2007","unstructured":"Screen J, Stanca-Kaposta EC, Gamblin DP, Liu B, Macleod NA, Snoek LC, Davis BG, Simons JP: IR-spectral signatures of aromatic\u2013sugar complexes: probing carbohydrate\u2013protein interactions. Angew Chem Int Ed. 2007, 46: 3644-3648. 10.1002\/anie.200605116.","journal-title":"Angew Chem Int Ed"},{"key":"5375_CR38","doi-asserted-by":"publisher","first-page":"769","DOI":"10.1042\/BJ20040892","volume":"382","author":"AB Boraston","year":"2004","unstructured":"Boraston AB, Bolam DN, Gilbert HJ, Davies GJ: Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem J. 2004, 382: 769-781. 10.1042\/BJ20040892.","journal-title":"Biochem J"},{"issue":"2","key":"5375_CR39","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1093\/protein\/13.2.89","volume":"13","author":"C Taroni","year":"2000","unstructured":"Taroni C, Jones S, Thornton JM: Analysis and prediction of carbohydrate binding sites. Protein Eng. 2000, 13 (2): 89-98. 10.1093\/protein\/13.2.89.","journal-title":"Protein Eng"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-162.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T19:46:33Z","timestamp":1630525593000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-13-162"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,7,11]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,12]]}},"alternative-id":["5375"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-13-162","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,7,11]]},"assertion":[{"value":"8 September 2011","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 June 2012","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 July 2012","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"162"}}