{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T03:22:57Z","timestamp":1774322577715,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2016,9,25]],"date-time":"2016-09-25T00:00:00Z","timestamp":1474761600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["81173568"],"award-info":[{"award-number":["81173568"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["81373985"],"award-info":[{"award-number":["81373985"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["81430094"],"award-info":[{"award-number":["81430094"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004602","name":"Program for New Century Excellent Talents in University","doi-asserted-by":"crossref","award":["NCET-11-0605"],"award-info":[{"award-number":["NCET-11-0605"]}],"id":[{"id":"10.13039\/501100004602","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>The metabolites of exogenous and endogenous compounds play a pivotal role in the domain of metabolism research. However, they are still unclear for most chemicals in our environment. 
In silico methods for predicting the site of metabolism (SOM) are considered efficient and low-cost tools for SOM discovery. However, many in silico methods focus on metabolic processes catalyzed by a few specific cytochrome P450s and apply only to substrates with particular skeletons. A SOM prediction model that places no special requirements on substrate structure and applies to more metabolic enzymes therefore deserves more attention.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>By combining hybrid feature selection techniques (CHI, IG, GR, Relief) with multiple classification procedures (KStar, BN, IBK, J48, RF, SVM, AdaBoostM1, Bagging), SOM prediction models for six oxidation reactions mediated by oxidoreductases were established by integrating enzyme data and chemical bond information. The advantage of the method is the introduction of unlabeled SOM. We defined SOM not reported in the literature as unlabeled SOM, from which negative SOM was filtered. Consequently, for each type of reaction, a series of SOM prediction models was built based on metabolism information for 1237 heterogeneous chemicals. Optimal models were then selected by comparing these models. Finally, an independent test set was used to validate the optimal models. All models gave accuracies above 0.90. In receiver operating characteristic analysis, the area under the curve values of all these models were over 0.906. 
The results suggested that these models have good predictive power.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>All models are available on request from wangyun@bucm.edu.cn<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btw617","type":"journal-article","created":{"date-parts":[[2016,9,26]],"date-time":"2016-09-26T00:07:40Z","timestamp":1474848460000},"page":"363-372","source":"Crossref","is-referenced-by-count":16,"title":["Site of metabolism prediction for oxidation reactions mediated by oxidoreductases based on chemical bond"],"prefix":"10.1093","volume":"33","author":[{"given":"Shuaibing","family":"He","sequence":"first","affiliation":[{"name":"Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, China"}]},{"given":"Manman","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, China"}]},{"given":"Xiaotong","family":"Ye","sequence":"additional","affiliation":[{"name":"Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, China"}]},{"given":"Hongyu","family":"Wang","sequence":"additional","affiliation":[{"name":"Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, China"}]},{"given":"Wenkang","family":"Yu","sequence":"additional","affiliation":[{"name":"Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, 
Beijing, China"}]},{"given":"Wenjing","family":"He","sequence":"additional","affiliation":[{"name":"Xinjiang Medical University Institute of TCM, Urumuqi, China"}]},{"given":"Yun","family":"Wang","sequence":"additional","affiliation":[{"name":"Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, China"}]},{"given":"Yanjiang","family":"Qiao","sequence":"additional","affiliation":[{"name":"Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, China"}]}],"member":"286","published-online":{"date-parts":[[2016,9,25]]},"reference":[{"key":"2023020204412231000_btw617-B1","doi-asserted-by":"crossref","first-page":"S8.","DOI":"10.1186\/1471-2105-16-S4-S8","article-title":"BgN-Score and BsN-Score: bagging and boosting based ensemble neural networks scoring functions for accurate binding affinity prediction of protein-ligand complexes","volume":"16","author":"Ashtawy","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2023020204412231000_btw617-B2","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn"},{"key":"2023020204412231000_btw617-B3","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1145\/1961189.1961199","article-title":"LIBSVM: a library for support vector machines","volume":"2","author":"Chang","year":"2011","journal-title":"ACM Trans. Intell. Syst. Technol"},{"key":"2023020204412231000_btw617-B4","doi-asserted-by":"crossref","first-page":"18162","DOI":"10.3390\/ijms151018162","article-title":"Towards global QSAR model building for acute toxicity: Munro database case study","volume":"15","author":"Chavan","year":"2014","journal-title":"Int. J. Mol. 
Sci"},{"key":"2023020204412231000_btw617-B5","first-page":"108","article-title":"K*: an instance-based learner using an entropic distance measure","volume":"1996","author":"Cleary","year":"1996","journal-title":"Mach. Learn. Proc"},{"key":"2023020204412231000_btw617-B6","doi-asserted-by":"crossref","first-page":"6970","DOI":"10.1021\/jm050529c","article-title":"MetaSite: understanding metabolism in human cytochromes from the perspective of the chemist","volume":"48","author":"Cruciani","year":"2005","journal-title":"J. Med. Chem"},{"key":"2023020204412231000_btw617-B7","doi-asserted-by":"crossref","first-page":"14677","DOI":"10.3390\/ijms160714677","article-title":"A mechanism-based model for the prediction of the metabolic sites of steroids mediated by cytochrome P450 3A4","volume":"16","author":"Dai","year":"2015","journal-title":"Int. J. Mol. Sci"},{"key":"2023020204412231000_btw617-B8","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1016\/0006-291X(78)91643-1","article-title":"Aliphatic hydroxylation by highly purified liver microsomal cytochrome P-450. Evidence for a carbon radical intermediate","volume":"81","author":"Groves","year":"1978","journal-title":"Biochem. Biophys. Res. Commun"},{"key":"2023020204412231000_btw617-B9","doi-asserted-by":"crossref","first-page":"e140330.","DOI":"10.1371\/journal.pone.0140330","article-title":"Pain intensity recognition rates via biopotential feature patterns with support vector machines","volume":"10","author":"Gruss","year":"2015","journal-title":"Plos One"},{"key":"2023020204412231000_btw617-B10","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1016\/j.bbapap.2010.07.017","article-title":"Flexibility of human cytochrome P450 enzymes: molecular dynamics and spectroscopy reveal important function-related variations","volume":"1814","author":"Hendrychova","year":"2011","journal-title":"Biochim. Biophys. 
Acta"},{"key":"2023020204412231000_btw617-B11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2014\/432109","article-title":"Feature selection for better identification of subtypes of Guillain-Barr\u00e9 syndrome","volume":"2014","author":"Hern\u00e1ndez-Torruco","year":"2014","journal-title":"Comput. Math. Method Med"},{"key":"2023020204412231000_btw617-B12","doi-asserted-by":"crossref","first-page":"131.","DOI":"10.1186\/1472-6947-12-131","article-title":"Decision tree-based learning to predict patient controlled analgesia consumption and readjustment","volume":"12","author":"Hu","year":"2012","journal-title":"BMC Med. Inform. Decis. Mak"},{"key":"2023020204412231000_btw617-B13","doi-asserted-by":"crossref","first-page":"12.","DOI":"10.1186\/s13321-016-0124-8","article-title":"GPURFSCREEN: a GPU based virtual screening tool using random forest classifier","volume":"8","author":"Jayaraj","year":"2016","journal-title":"J. Cheminform"},{"key":"2023020204412231000_btw617-B14","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1016\/S0090-9556(25)07306-4","article-title":"Putative active site template model for cytochrome P4502C9 (tolbutamide hydroxylase)","volume":"24","author":"Jones","year":"1996","journal-title":"Drug Metab. Dispos"},{"key":"2023020204412231000_btw617-B15","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1007\/s11095-014-1511-3","article-title":"Combining structure- and ligand-based approaches to improve site of metabolism prediction in CYP2C9 substrates","volume":"32","author":"Kingsley","year":"2015","journal-title":"Pharm. Res"},{"key":"2023020204412231000_btw617-B16","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1021\/ci200542m","article-title":"Computational prediction of metabolism: sites, products, SAR, P450 enzyme dynamics, and mechanisms","volume":"52","author":"Kirchmair","year":"2012","journal-title":"J. Chem. Inf. 
Model"},{"key":"2023020204412231000_btw617-B17","doi-asserted-by":"crossref","first-page":"591","DOI":"10.1007\/BF02708366","article-title":"Study of atomic and condensed atomic indices for reactive sites of molecules","volume":"117","author":"Kolandaivel","year":"2005","journal-title":"J. Chem. Sci"},{"key":"2023020204412231000_btw617-B18","volume-title":"Estimating Attributes: Analysis and Extensions of RELIEF","author":"Kononenko","year":"1994"},{"key":"2023020204412231000_btw617-B19","doi-asserted-by":"crossref","first-page":"42.","DOI":"10.1186\/1471-2091-12-42","article-title":"BKM-react, an integrated biochemical reaction database","volume":"12","author":"Lang","year":"2011","journal-title":"BMC Biochem"},{"key":"2023020204412231000_btw617-B20","doi-asserted-by":"crossref","first-page":"843","DOI":"10.1007\/s10822-008-9225-4","article-title":"Considerations and recent advances in QSAR models for cytochrome P450-mediated drug metabolism prediction","volume":"22","author":"Li","year":"2008","journal-title":"J. Comput. Aided Mol. Des"},{"key":"2023020204412231000_btw617-B21","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1111\/j.1365-2753.2005.00598.x","article-title":"Measuring diagnostic and predictive accuracy in disease management: an introduction to receiver operating characteristic (ROC) analysis","volume":"12","author":"Linden","year":"2006","journal-title":"J. Eval. Clin. Pract"},{"key":"2023020204412231000_btw617-B22","doi-asserted-by":"crossref","first-page":"1698","DOI":"10.1021\/ci3001524","article-title":"2D SMARTCyp reactivity-based site of metabolism prediction for major drug-metabolizing cytochrome P450 enzymes","volume":"52","author":"Liu","year":"2012","journal-title":"J. Chem. Inf. 
Model"},{"key":"2023020204412231000_btw617-B23","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1080\/15376510701857320","article-title":"In silico tools for sharing data and knowledge on toxicity and metabolism: Derek for windows, meteor, and vitic","volume":"18","author":"Marchant","year":"2008","journal-title":"Toxicol. Mech. Methods"},{"key":"2023020204412231000_btw617-B24","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1145\/1656274.1656278","article-title":"The WEKA data mining software: An update","volume":"11","author":"Mark Hall","year":"2009","journal-title":"SIGKDD Explorations"},{"key":"2023020204412231000_btw617-B25","doi-asserted-by":"crossref","first-page":"120","DOI":"10.2174\/1389200215666140130125339","article-title":"Advances in methods for predicting phase I metabolism of polyphenols","volume":"15","author":"Melo-Filho","year":"2014","journal-title":"Curr. Drug Metab"},{"key":"2023020204412231000_btw617-B26","doi-asserted-by":"crossref","first-page":"2141","DOI":"10.1007\/s10916-011-9678-1","article-title":"SVM feature selection based rotation forest ensemble classifiers to improve computer-aided diagnosis of Parkinson disease","volume":"36","author":"Ozcift","year":"2012","journal-title":"J. Med. Syst"},{"key":"2023020204412231000_btw617-B27","doi-asserted-by":"crossref","first-page":"450531.","DOI":"10.1155\/2015\/450531","article-title":"A computer-aided diagnosis system for dynamic contrast-enhanced MR images based on level set segmentation and ReliefF feature selection","volume":"2015","author":"Pang","year":"2015","journal-title":"Comput. Math. 
Methods Med"},{"key":"2023020204412231000_btw617-B28","doi-asserted-by":"crossref","first-page":"464","DOI":"10.1016\/j.patcog.2012.08.007","article-title":"Stochastic margin-based structure learning of Bayesian network classifiers","volume":"46","author":"Pernkopf","year":"2013","journal-title":"Pattern Recognit"},{"key":"2023020204412231000_btw617-B29","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1007\/BF00116251","article-title":"Induction of decision trees","volume":"1","author":"Quinlan","year":"1986","journal-title":"Mach. Learn"},{"key":"2023020204412231000_btw617-B30","doi-asserted-by":"crossref","first-page":"e58772.","DOI":"10.1371\/journal.pone.0058772","article-title":"Improved classification of lung cancer tumors based on structural and physicochemical properties of proteins using data mining models","volume":"8","author":"Ramani","year":"2013","journal-title":"Plos One"},{"key":"2023020204412231000_btw617-B31","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1016\/j.neucom.2003.10.011","article-title":"Margin maximization with feed-forward neural networks: a comparative study with SVM and AdaBoost","volume":"57","author":"Romero","year":"2004","journal-title":"Neurocomputing"},{"key":"2023020204412231000_btw617-B32","doi-asserted-by":"crossref","first-page":"2507","DOI":"10.1093\/bioinformatics\/btm344","article-title":"A review of feature selection techniques in bioinformatics","volume":"23","author":"Saeys","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020204412231000_btw617-B33","first-page":"58","article-title":"Research of the improved Adaboost Algorithm based on unbalanced data","volume":"15","author":"Shang","year":"2015","journal-title":"Int. J. Comput. Sci. Netw. 
Security"},{"key":"2023020204412231000_btw617-B34","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1016\/j.jmgm.2014.09.005","article-title":"Effects of protein flexibility on the site of metabolism prediction for CYP2A6 substrates","volume":"54","author":"Sheng","year":"2014","journal-title":"J. Mol. Graph. Model"},{"key":"2023020204412231000_btw617-B35","doi-asserted-by":"crossref","first-page":"3173","DOI":"10.1021\/jm0613471","article-title":"Empirical regioselectivity models for human cytochromes P450 3A4, 2D6, and 2C9","volume":"50","author":"Sheridan","year":"2007","journal-title":"J. Med. Chem"},{"key":"2023020204412231000_btw617-B36","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1517\/17425255.2011.553599","article-title":"In silico site of metabolism prediction of cytochrome P450-mediated biotransformations","volume":"7","author":"Tarcsay","year":"2011","journal-title":"Expert. Opin. Drug Metab. Toxicol"},{"key":"2023020204412231000_btw617-B37","doi-asserted-by":"crossref","first-page":"495","DOI":"10.2165\/00003088-200544050-00003","article-title":"Pharmacokinetic profile of ganciclovir after its oral administration and from its prodrug, valganciclovir, in solid organ transplant recipients","volume":"44","author":"Wiltshire","year":"2005","journal-title":"Clin. Pharmacokinet"},{"key":"2023020204412231000_btw617-B38","doi-asserted-by":"crossref","first-page":"1667","DOI":"10.1021\/ci2000488","article-title":"RS-Predictor: a new tool for predicting sites of cytochrome P450-mediated metabolism applied to CYP 3A4","volume":"51","author":"Zaretzki","year":"2011","journal-title":"J. Chem. Inf. Model"},{"key":"2023020204412231000_btw617-B39","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1016\/j.jep.2014.04.039","article-title":"Absorption and metabolism of three monoester-diterpenoid alkaloids in Aconitum carmichaeli after oral administration to rats by HPLC-MS","volume":"154","author":"Zhang","year":"2014","journal-title":"J. 
Ethnopharmacol"},{"key":"2023020204412231000_btw617-B40","doi-asserted-by":"crossref","first-page":"1775","DOI":"10.1021\/ci0502707","article-title":"Structure-based classification of chemical reactions without assignment of reaction centers","volume":"45","author":"Zhang","year":"2005","journal-title":"J. Chem. Inf. Model"},{"key":"2023020204412231000_btw617-B41","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1145\/1007730.1007741","article-title":"Feature selection for text categorization on imbalanced data","volume":"6","author":"Zheng","year":"2004","journal-title":"ACM SIGKDD Explorations"},{"key":"2023020204412231000_btw617-B42","doi-asserted-by":"crossref","first-page":"1251","DOI":"10.1093\/bioinformatics\/btp140","article-title":"Site of metabolism prediction for six biotransformations mediated by cytochromes P450","volume":"25","author":"Zheng","year":"2009","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/3\/363\/49037731\/bioinformatics_33_3_363.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/3\/363\/49037731\/bioinformatics_33_3_363.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,10]],"date-time":"2025-06-10T21:13:14Z","timestamp":1749589994000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/3\/363\/2726820"}},"subtitle":[],"editor":[{"given":"Anna","family":"Tramontano","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2016,9,25]]},"references-count":42,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw617","relation":{},"ISSN":["1367-48
03","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,2,1]]},"published":{"date-parts":[[2016,9,25]]}}}