{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,11]],"date-time":"2025-11-11T22:24:56Z","timestamp":1762899896115,"version":"3.37.3"},"reference-count":54,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2020,3,12]],"date-time":"2020-03-12T00:00:00Z","timestamp":1583971200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01GM124061"],"award-info":[{"award-number":["R01GM124061"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>A unique challenge in predictive model building for omics data has been the small number of samples (n) versus the large amount of features (p). This \u2018n\u226ap\u2019 property brings difficulties for disease outcome classification using deep learning techniques. Sparse learning by incorporating known functional relationships between the biological units, such as the graph-embedded deep feedforward network (GEDFN) model, has been a solution to this issue. However, such methods require an existing feature graph, and potential mis-specification of the feature graph can be harmful on classification and feature selection.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>To address this limitation and develop a robust classification model without relying on external knowledge, we propose a forest graph-embedded deep feedforward network (forgeNet) model, to integrate the GEDFN architecture with a forest feature graph extractor, so that the feature graph can be learned in a supervised manner and specifically constructed for a given prediction task. To validate the method\u2019s capability, we experimented the forgeNet model with both synthetic and real datasets. The resulting high classification accuracy suggests that the method is a valuable addition to sparse deep learning models for omics data.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The method is available at https:\/\/github.com\/yunchuankong\/forgeNet.<\/jats:p><\/jats:sec><jats:sec><jats:title>Contact<\/jats:title><jats:p>tianwei.yu@emory.edu<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa164","type":"journal-article","created":{"date-parts":[[2020,3,9]],"date-time":"2020-03-09T12:44:38Z","timestamp":1583757878000},"page":"3507-3515","source":"Crossref","is-referenced-by-count":26,"title":["forgeNet: a graph deep neural network model using tree-based ensemble classifiers for feature graph construction"],"prefix":"10.1093","volume":"36","author":[{"given":"Yunchuan","family":"Kong","sequence":"first","affiliation":[{"name":"Department of Biostatistics and Bioinformatics , Emory University, Atlanta, GA 30322, USA"}]},{"given":"Tianwei","family":"Yu","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Bioinformatics , Emory University, Atlanta, GA 30322, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,3,12]]},"reference":[{"first-page":"265","year":"2016","author":"Abadi","key":"2023062312020443700_btaa164-B1"},{"key":"2023062312020443700_btaa164-B2","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1126\/science.286.5439.509","article-title":"Emergence of scaling in random networks","volume":"286","author":"Barab\u00e1si","year":"1999","journal-title":"Science"},{"key":"2023062312020443700_btaa164-B3","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-349-03521-2","volume-title":"Graph Theory with Applications","author":"Bondy","year":"1976"},{"key":"2023062312020443700_btaa164-B4","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn"},{"key":"2023062312020443700_btaa164-B5","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn"},{"key":"2023062312020443700_btaa164-B6","doi-asserted-by":"crossref","first-page":"791","DOI":"10.1039\/C4MB00659C","article-title":"Classification of lung cancer using ensemble-based feature selection and machine learning methods","volume":"11","author":"Cai","year":"2015","journal-title":"Mol. Biosyst"},{"key":"2023062312020443700_btaa164-B7","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1186\/s13058-019-1107-2","article-title":"AMP-activated protein kinase: a potential therapeutic target for triple-negative breast cancer","volume":"21","author":"Cao","year":"2019","journal-title":"Breast Cancer Res"},{"key":"2023062312020443700_btaa164-B8","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1093\/glycob\/cwy003","article-title":"Keratan sulfate, a complex glycosaminoglycan with unique functional capability","volume":"28","author":"Caterson","year":"2018","journal-title":"Glycobiology"},{"year":"2016","author":"Chen","key":"2023062312020443700_btaa164-B9"},{"key":"2023062312020443700_btaa164-B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.compbiomed.2014.02.006","article-title":"Risk classification of cancer survival using ANN with gene expression data from multiple laboratories","volume":"48","author":"Chen","year":"2014","journal-title":"Comput. Biol. Med"},{"key":"2023062312020443700_btaa164-B11","doi-asserted-by":"crossref","DOI":"10.1093\/database\/bau126","article-title":"Comparison of human cell signaling pathway databases\u2014evolution, drawbacks and challenges","volume":"2015","author":"Chowdhury","year":"2015","journal-title":"Database (Oxford)"},{"key":"2023062312020443700_btaa164-B12","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1186\/1752-0509-6-92","article-title":"HINT: high-quality protein interactomes and their applications in understanding human disease","volume":"6","author":"Das","year":"2012","journal-title":"BMC Syst. Biol"},{"key":"2023062312020443700_btaa164-B13","doi-asserted-by":"crossref","first-page":"e1002180","DOI":"10.1371\/journal.pcbi.1002180","article-title":"Protein networks as logic functions in development and cancer","volume":"7","author":"Dutkowski","year":"2011","journal-title":"PLoS Comput. Biol"},{"first-page":"290","year":"1959","author":"Erd\u00f6s","key":"2023062312020443700_btaa164-B14"},{"key":"2023062312020443700_btaa164-B15","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1093\/bioinformatics\/btl567","article-title":"Using GOstats to test gene lists for GO term association","volume":"23","author":"Falcon","year":"2007","journal-title":"Bioinformatics"},{"key":"2023062312020443700_btaa164-B16","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","article-title":"Stochastic gradient boosting","volume":"38","author":"Friedman","year":"2002","journal-title":"Comput. Stat. Data Anal"},{"volume-title":"Deep Learning","year":"2016","author":"Goodfellow","key":"2023062312020443700_btaa164-B17"},{"year":"2001","author":"Hochreiter","key":"2023062312020443700_btaa164-B18"},{"key":"2023062312020443700_btaa164-B19","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1186\/1471-2407-8-286","article-title":"Upregulated HSP27 in human breast cancer cells reduces Herceptin susceptibility by increasing Her2 protein stability","volume":"8","author":"Kang","year":"2008","journal-title":"BMC Cancer"},{"key":"2023062312020443700_btaa164-B20","doi-asserted-by":"crossref","first-page":"582","DOI":"10.1111\/biom.12035","article-title":"Network-based penalized regression with application to genomic data","volume":"69","author":"Kim","year":"2013","journal-title":"Biometrics"},{"year":"2014","author":"Kingma","key":"2023062312020443700_btaa164-B21"},{"key":"2023062312020443700_btaa164-B22","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1038\/nature11412","article-title":"Comprehensive molecular portraits of human breast tumours","volume":"490","author":"Koboldt","year":"2012","journal-title":"Nature"},{"key":"2023062312020443700_btaa164-B23","doi-asserted-by":"crossref","first-page":"3727","DOI":"10.1093\/bioinformatics\/bty429","article-title":"A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data","volume":"34","author":"Kong","year":"2018","journal-title":"Bioinformatics"},{"key":"2023062312020443700_btaa164-B24","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1186\/1471-2105-15-8","article-title":"Robustness of random forest-based gene selection methods","volume":"15","author":"Kursa","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023062312020443700_btaa164-B25","doi-asserted-by":"crossref","first-page":"694","DOI":"10.1089\/cmb.2012.0065","article-title":"Network-induced classification kernels for gene expression profile analysis","volume":"19","author":"Lavi","year":"2012","journal-title":"J. Comput. Biol"},{"key":"2023062312020443700_btaa164-B26","doi-asserted-by":"crossref","first-page":"9665","DOI":"10.1038\/s41598-019-46046-6","article-title":"Role of keratan sulfate expression in human pancreatic cancer malignancy","volume":"9","author":"Leiphrakpam","year":"2019","journal-title":"Sci. Rep"},{"key":"2023062312020443700_btaa164-B27","doi-asserted-by":"crossref","first-page":"e1003123","DOI":"10.1371\/journal.pcbi.1003123","article-title":"Predicting network activity from high throughput metabolomics","volume":"9","author":"Li","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023062312020443700_btaa164-B28","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1186\/1471-2105-14-198","article-title":"Sparse logistic regression with a l 1\/2 penalty for gene selection in cancer classification","volume":"14","author":"Liang","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023062312020443700_btaa164-B29","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1016\/S0169-409X(03)00033-4","article-title":"Mammary physiology and milk secretion","volume":"55","author":"McManaman","year":"2003","journal-title":"Adv. Drug Deliv. Rev"},{"key":"2023062312020443700_btaa164-B30","doi-asserted-by":"crossref","first-page":"R31","DOI":"10.1186\/bcr2853","article-title":"Breast cancer cell migration is regulated through junctional adhesion molecule-A-mediated activation of Rap1 GTPase","volume":"13","author":"McSherry","year":"2011","journal-title":"Breast Cancer Res"},{"first-page":"851","year":"2016","author":"Min","key":"2023062312020443700_btaa164-B31"},{"key":"2023062312020443700_btaa164-B32","doi-asserted-by":"crossref","first-page":"29487","DOI":"10.18632\/oncotarget.15494","article-title":"Fatty acid metabolism in breast cancer subtypes","volume":"8","author":"Monaco","year":"2017","journal-title":"Oncotarget"},{"first-page":"807","year":"2010","author":"Nair","key":"2023062312020443700_btaa164-B33"},{"key":"2023062312020443700_btaa164-B34","doi-asserted-by":"crossref","first-page":"69","DOI":"10.3389\/fendo.2018.00069","article-title":"Proteoglycans-biomarkers and targets in cancer therapy","volume":"9","author":"Nikitovic","year":"2018","journal-title":"Front. Endocrinol. (Lausanne)"},{"key":"2023062312020443700_btaa164-B35","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023062312020443700_btaa164-B36","doi-asserted-by":"crossref","first-page":"045004","DOI":"10.1088\/1478-3975\/11\/4\/045004","article-title":"Modeling and analysis of transport in the mammary glands","volume":"11","author":"Quezada","year":"2014","journal-title":"Phys. Biol"},{"key":"2023062312020443700_btaa164-B37","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1007\/978-1-4939-2425-7_3","article-title":"Protein\u2013protein interaction databases","volume":"1278","author":"Szklarczyk","year":"2015","journal-title":"Methods Mol. Biol"},{"key":"2023062312020443700_btaa164-B38","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1007\/978-3-319-07692-8_34","volume-title":"Recent Advances on Soft Computing and Data Mining","author":"Tang","year":"2014"},{"key":"2023062312020443700_btaa164-B39","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.clbc.2016.07.015","article-title":"Mechanisms that increase stability of estrogen receptor alpha in breast cancer","volume":"17","author":"Tecalco-Cruz","year":"2017","journal-title":"Clin. Breast Cancer"},{"key":"2023062312020443700_btaa164-B40","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B (Methodol.)"},{"key":"2023062312020443700_btaa164-B41","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2023062312020443700_btaa164-B42","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.procs.2015.03.178","article-title":"Gene expression data classification using support vector machine and mutual information-based gene selection","volume":"47","author":"Vanitha","year":"2015","journal-title":"Procedia Comput. Sci"},{"first-page":"744","year":"2011","author":"Vens","key":"2023062312020443700_btaa164-B43"},{"key":"2023062312020443700_btaa164-B44","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1093\/nar\/gks494","article-title":"DIANA miRPath v.2.0: investigating the combinatorial effect of microRNAs in pathways","volume":"40","author":"Vlachos","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023062312020443700_btaa164-B45","doi-asserted-by":"crossref","DOI":"10.3390\/ijms19103028","article-title":"Role of extracellular matrix in development and cancer progression","volume":"19","author":"Walker","year":"2018","journal-title":"Int. J. Mol. Sci"},{"key":"2023062312020443700_btaa164-B46","doi-asserted-by":"crossref","first-page":"2185","DOI":"10.2147\/OTT.S157058","article-title":"The role of Hippo signal pathway in breast cancer metastasis","volume":"11","author":"Wei","year":"2018","journal-title":"Onco Targets Ther"},{"year":"2020","author":"Wu","key":"2023062312020443700_btaa164-B47"},{"key":"2023062312020443700_btaa164-B48","doi-asserted-by":"crossref","first-page":"1930","DOI":"10.1093\/bioinformatics\/btp291","article-title":"apLCMS\u2014adaptive processing of high-resolution LC\/MS data","volume":"25","author":"Yu","year":"2009","journal-title":"Bioinformatics"},{"key":"2023062312020443700_btaa164-B49","doi-asserted-by":"crossref","first-page":"1419","DOI":"10.1021\/pr301053d","article-title":"Hybrid feature detection and information accumulation using high-resolution LC-MS metabolomics data","volume":"12","author":"Yu","year":"2013","journal-title":"J. Proteome Res"},{"key":"2023062312020443700_btaa164-B50","first-page":"197","article-title":"AMP-activated protein kinase and energy balance in breast cancer","volume":"9","author":"Zhao","year":"2017","journal-title":"Am. J. Transl. Res"},{"key":"2023062312020443700_btaa164-B51","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1214\/14-AOAS719","article-title":"A bayesian nonparametric mixture model for selecting genes and gene subnetworks","volume":"8","author":"Zhao","year":"2014","journal-title":"Ann. Appl. Stat"},{"key":"2023062312020443700_btaa164-B52","doi-asserted-by":"crossref","first-page":"S21","DOI":"10.1186\/1471-2105-10-S1-S21","article-title":"Network-based support vector machine for classification of microarray samples","volume":"10","author":"Zhu","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023062312020443700_btaa164-B53","doi-asserted-by":"crossref","first-page":"466","DOI":"10.3389\/fphys.2018.00466","article-title":"Biological roles of aberrantly expressed glycosphingolipids and related enzymes in human cancer development and progression","volume":"9","author":"Zhuo","year":"2018","journal-title":"Front. Physiol"},{"key":"2023062312020443700_btaa164-B54","doi-asserted-by":"crossref","first-page":"899","DOI":"10.3892\/mmr.2016.6094","article-title":"AMPK activators suppress breast cancer cell growth by inhibiting DVL3-facilitated Wnt\/\u03b2-catenin signaling pathway activity","volume":"15","author":"Zou","year":"2017","journal-title":"Mol. Med. Rep"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa164\/33129339\/btaa164.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/11\/3507\/50670913\/bioinformatics_36_11_3507.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/11\/3507\/50670913\/bioinformatics_36_11_3507.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T21:14:07Z","timestamp":1722546847000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/11\/3507\/5803642"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,3,12]]},"references-count":54,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa164","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2020,6]]},"published":{"date-parts":[[2020,3,12]]}}}