{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T09:25:41Z","timestamp":1766136341343},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"S15","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2012,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>In the context of drug discovery and development, much effort has been exerted to determine which conformers of a given molecule are responsible for the observed biological activity. In this work we aimed to predict bioactive conformers using a variant of supervised learning, named multiple-instance learning. A single molecule, treated as a bag of conformers, is biologically active if and only if at least one of its conformers, treated as an instance, is responsible for the observed bioactivity; and a molecule is inactive if none of its conformers is responsible for the observed bioactivity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Methods<\/jats:title>\n            <jats:p>We encoded the 3-dimensional structures using pharmacophore fingerprints which are binary strings, and accomplished instance-based embedding using calculated dissimilarity distances. Four dissimilarity measures were employed and their performances were compared. 1-norm SVM was used for joint feature selection and classification. The approach was applied to four data sets, and the best proposed model for each data set was determined by using the dissimilarity measure yielding the smallest number of selected features.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>The predictive abilities of the proposed approach were compared with three classical predictive models without instance-based embedding. The proposed approach produced the best predictive models for one data set and second best predictive models for the rest of the data sets, based on the external validations. To validate the ability of the proposed approach to find bioactive conformers, 12 small molecules with co-crystallized structures were seeded in one data set. 10 out of 12 co-crystallized structures were indeed identified as significant conformers using the proposed approach.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>The proposed approach was proven not to suffer from overfitting and to be highly competitive with classical predictive models, so it is very powerful for drug activity prediction. The approach was also validated as a useful method for pursuit of bioactive conformers.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-13-s15-s3","type":"journal-article","created":{"date-parts":[[2012,9,12]],"date-time":"2012-09-12T01:11:42Z","timestamp":1347412302000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Implementation of multiple-instance learning in drug activity prediction"],"prefix":"10.1186","volume":"13","author":[{"given":"Gang","family":"Fu","sequence":"first","affiliation":[]},{"given":"Xiaofei","family":"Nan","sequence":"additional","affiliation":[]},{"given":"Haining","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Ronak Y","family":"Patel","sequence":"additional","affiliation":[]},{"given":"Pankaj R","family":"Daga","sequence":"additional","affiliation":[]},{"given":"Yixin","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Dawn E","family":"Wilkins","sequence":"additional","affiliation":[]},{"given":"Robert J","family":"Doerksen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2012,9,11]]},"reference":[{"key":"5346_CR1","doi-asserted-by":"publisher","first-page":"3297","DOI":"10.1021\/cr000095n","volume":"105","author":"F Fanelli","year":"2005","unstructured":"Fanelli F, De Benedetti PG: Computational modeling approaches to structure-function analysis of G protein-coupled receptors. Chemical Reviews 2005, 105: 3297\u20133351. 10.1021\/cr000095n","journal-title":"Chemical Reviews"},{"key":"5346_CR2","doi-asserted-by":"publisher","first-page":"928","DOI":"10.1002\/1439-7633(20021004)3:10<928::AID-CBIC928>3.0.CO;2-5","volume":"3","author":"T Klabunde","year":"2002","unstructured":"Klabunde T, Hessler G: Drug design strategies for targeting G-protein-coupled receptors. ChemBioChem 2002, 3: 928\u2013944. 10.1002\/1439-7633(20021004)3:10<928::AID-CBIC928>3.0.CO;2-5","journal-title":"ChemBioChem"},{"key":"5346_CR3","doi-asserted-by":"publisher","first-page":"1931","DOI":"10.1109\/TPAMI.2006.248","volume":"28","author":"Y Chen","year":"2006","unstructured":"Chen Y, Bi J, Wang JZ: MILES: Multiple-instance learning via embedded instance selection. IEEE Trans Pattern Anal Mach Intell 2006, 28: 1931\u20131947.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"5346_CR4","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1016\/S0004-3702(96)00034-3","volume":"89","author":"TG Dietterich","year":"1997","unstructured":"Dietterich TG, Lathrop RH, Lozano-Perez T: Solving the multiple instance problem with axis-parallel rectangles. Artif Intell 1997, 89: 31\u201371. 10.1016\/S0004-3702(96)00034-3","journal-title":"Artif Intell"},{"key":"5346_CR5","first-page":"233","volume-title":"Proceedings of the 16th International Conference on Data Engineering:28 February - 3 March 2000; San Diego","author":"C Yang","year":"2000","unstructured":"Yang C, Lozano-Perez T: Image database retrieval with multiple-instance learning techniques. In Proceedings of the 16th International Conference on Data Engineering:28 February - 3 March 2000; San Diego. Edited by: David B. Lomet. Gerhard Weikum; 2000:233\u2013243."},{"key":"5346_CR6","first-page":"561","volume":"15","author":"S Andrews","year":"2003","unstructured":"Andrews S, Tsochantaridis I, Hofmann T: Support Vector Machines for Multiple-Instance Learning. Adv Neur In 2003, 15: 561\u2013568.","journal-title":"Adv Neur In"},{"key":"5346_CR7","first-page":"341","volume-title":"Proceedings of the 15th International Conference on Machine Learning: 24-27 July 1998; Madison","author":"O Maron","year":"1998","unstructured":"Maron O, Ratan AL: Multiple-Instance Learning for Natural Scene Classification. In Proceedings of the 15th International Conference on Machine Learning: 24\u201327 July 1998; Madison Edited by: Morgan Kaufmann. 1998, 341\u2013349."},{"key":"5346_CR8","doi-asserted-by":"publisher","first-page":"569","DOI":"10.1021\/ci980159j","volume":"39","author":"MJ McGregor","year":"1999","unstructured":"McGregor MJ, Muskal MM: Pharmacophore fingerprint. 1. Application to QSAR and focused library design. J Chem Inf Comput Sci 1999, 39: 569\u2013574. 10.1021\/ci980159j","journal-title":"J Chem Inf Comput Sci"},{"key":"5346_CR9","doi-asserted-by":"publisher","first-page":"3251","DOI":"10.1021\/jm9806998","volume":"42","author":"JS Mason","year":"1999","unstructured":"Mason JS, Morize I, Menard PR, Cheney DL, Hulme C, Labaudiniere RF: New 4-point pharmacophore method for molecular similarity and diversity applications: overview of the method and applications, including a novel approach to the design of combinatorial libraries containing privileged substructures. J Med Chem 1999, 42: 3251\u20133264. 10.1021\/jm9806998","journal-title":"J Med Chem"},{"key":"5346_CR10","doi-asserted-by":"publisher","first-page":"2770","DOI":"10.1021\/jm990578n","volume":"43","author":"EK Bradley","year":"2000","unstructured":"Bradley EK, Beroza P, Penzotti JE, Grootenhuis PDJ, Spellmeyer DC, Miller JL: A rapid computational method for lead evolution: description and application to alpha1-adrenergic antagonists. J Med Chem 2000, 43: 2770\u20132774. 10.1021\/jm990578n","journal-title":"J Med Chem"},{"key":"5346_CR11","doi-asserted-by":"publisher","first-page":"1737","DOI":"10.1021\/jm0255062","volume":"45","author":"JE Penzotti","year":"2002","unstructured":"Penzotti JE, Lamb ML, Evensen E, Grootenhuis PDJ: A computational ensemble pharmacophore model for identifying substrates of P-glycoprotein. J Med Chem 2002, 45: 1737\u20131740. 10.1021\/jm0255062","journal-title":"J Med Chem"},{"key":"5346_CR12","doi-asserted-by":"publisher","first-page":"2429","DOI":"10.1021\/ci700284p","volume":"47","author":"WX Li","year":"2007","unstructured":"Li WX, Li L, Eksterowicz J, Ling XB, Cardozo M: Significance analysis and multiple pharmacophore models for differentiating P-glycoprotein substrates. J Chem Inf Model 2007, 47: 2429\u20132438. 10.1021\/ci700284p","journal-title":"J Chem Inf Model"},{"key":"5346_CR13","doi-asserted-by":"publisher","first-page":"983","DOI":"10.1021\/ci9800211","volume":"38","author":"P Willett","year":"1998","unstructured":"Willett P, Barnard JM, Downs GM: Chemical similarity searching. J Chem Inf Comp Sci 1998, 38: 983\u2013996. 10.1021\/ci9800211","journal-title":"J Chem Inf Comp Sci"},{"key":"5346_CR14","doi-asserted-by":"publisher","first-page":"479","DOI":"10.1038\/nrd1415","volume":"3","author":"P Cohen","year":"2004","unstructured":"Cohen P, Goedert M: GSK3 inhibitors: Development and therapeutic potential. Nat Rev Drug Discov 2004, 3: 479\u2013487. 10.1038\/nrd1415","journal-title":"Nat Rev Drug Discov"},{"key":"5346_CR15","doi-asserted-by":"publisher","first-page":"1751","DOI":"10.2174\/138161206776873743","volume":"12","author":"S Pavlopoulos","year":"2006","unstructured":"Pavlopoulos S, Thakur GA, Nikas SP, Makriyannis A: Cannabinoid receptors as therapeutic targets. Curr Pharm Des 2006, 12: 1751\u20131769. 10.2174\/138161206776873743","journal-title":"Curr Pharm Des"},{"key":"5346_CR16","doi-asserted-by":"publisher","first-page":"778","DOI":"10.1592\/phco.21.9.778.34558","volume":"21","author":"CJ Matheny","year":"2001","unstructured":"Matheny CJ, Lamb MW, Brouwer KLR, Pollack GM: Pharmacokinetic and pharmacodynamic implications of P-glycoprotein modulation. Pharmacotherapy 2001, 21: 778\u2013796. 10.1592\/phco.21.9.778.34558","journal-title":"Pharmacotherapy"},{"key":"5346_CR17","doi-asserted-by":"publisher","first-page":"5116","DOI":"10.1073\/pnas.091062498","volume":"98","author":"VG Tusher","year":"2001","unstructured":"Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 2001, 98: 5116\u20135121. 10.1073\/pnas.091062498","journal-title":"Proc Natl Acad Sci USA"},{"key":"5346_CR18","first-page":"570","volume":"10","author":"O Maron","year":"1998","unstructured":"Maron O, Lozano-Perez T: A framework for multiple-instance learning. Adv Neur In 1998, 10: 570\u2013576.","journal-title":"Adv Neur In"},{"key":"5346_CR19","first-page":"1229","volume":"3","author":"J Bi","year":"2003","unstructured":"Bi J, Bennett KP, Embrechts M, Breneman C, Song M: Dimensionality reduction via sparse support vector machines. J Mach Learn Res 2003, 3: 1229\u20131243.","journal-title":"J Mach Learn Res"},{"key":"5346_CR20","volume-title":"Classification and Regression Trees","author":"L Breiman","year":"1984","unstructured":"Breiman L, Friedman J, Olshen RA, Stone CJ: Classification and Regression Trees. Belmont: Chapman & Hall; 1984."},{"key":"5346_CR21","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L: Random forests. Mach Learn 2001, 45: 5\u201332. 10.1023\/A:1010933404324","journal-title":"Mach Learn"},{"key":"5346_CR22","first-page":"273","volume":"20","author":"C Cortes","year":"1995","unstructured":"Cortes C, Vapnik V: Support-vector networks. Mach Learn 1995, 20: 273\u2013297.","journal-title":"Mach Learn"},{"key":"5346_CR23","doi-asserted-by":"publisher","first-page":"1947","DOI":"10.1021\/ci034160g","volume":"43","author":"V Svetnik","year":"2003","unstructured":"Svetnik V, Liaw A, Tong C, Culberson JC, Sheridan RP, Feuston BP: Random forest: a classification and regression tool for compound classification and QSAR modeling. J Chem Inf Comput Sci 2003, 43: 1947\u20131958. 10.1021\/ci034160g","journal-title":"J Chem Inf Comput Sci"},{"key":"5346_CR24","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"BW Matthews","year":"1975","unstructured":"Matthews BW: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta 1975, 405: 442\u2013451. 10.1016\/0005-2795(75)90109-9","journal-title":"Biochim Biophys Acta"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-S15-S3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T19:42:02Z","timestamp":1630525322000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-13-S15-S3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,9]]},"references-count":24,"journal-issue":{"issue":"S15","published-print":{"date-parts":[[2012,9]]}},"alternative-id":["5346"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-13-s15-s3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,9]]},"assertion":[{"value":"11 September 2012","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S3"}}