{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T07:49:56Z","timestamp":1770536996969,"version":"3.49.0"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2021,5,27]],"date-time":"2021-05-27T00:00:00Z","timestamp":1622073600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["HL155821"],"award-info":[{"award-number":["HL155821"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["MH107571"],"award-info":[{"award-number":["MH107571"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["HL116656"],"award-info":[{"award-number":["HL116656"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["HL115227"],"award-info":[{"award-number":["HL115227"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["AG051981"],"award-info":[{"award-number":["AG051981"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,11,18]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Traditional regression models are limited in outcome prediction due to their parametric nature. Current deep learning methods allow for various effects and interactions and have shown improved performance, but they typically need to be trained on a large amount of data to obtain reliable results. Gene expression studies often have small sample sizes but high dimensional correlated predictors so that traditional deep learning methods are not readily applicable.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>In this article, we proposed peel learning, a novel neural network that incorporates the prior relationship among genes. In each layer of learning, overall structure is peeled into multiple local substructures. Within the substructure, dependency among variables is reduced through linear projections. The overall structure is gradually simplified over layers and weight parameters are optimized through a revised backpropagation. We applied PL to a small lung transplantation study to predict recipients\u2019 post-surgery primary graft dysfunction using donors\u2019 gene expressions within several immunology pathways, where PL showed improved prediction accuracy compared to conventional penalized regression, classification trees, feed-forward neural network and a neural network assuming prior network structure. Through simulation studies, we also demonstrated the advantage of adding specific structure among predictor variables in neural network, over no or uniform group structure, which is more favorable in smaller studies. The empirical evidence is consistent with our theoretical proof of improved upper bound of PL\u2019s complexity over ordinary neural networks.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>PL algorithm was implemented in Python and the open-source code and instruction will be available at https:\/\/github.com\/Likelyt\/Peel-Learning.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab402","type":"journal-article","created":{"date-parts":[[2021,5,27]],"date-time":"2021-05-27T03:16:37Z","timestamp":1622085397000},"page":"4108-4114","source":"Crossref","is-referenced-by-count":2,"title":["Peel learning for pathway-related outcome prediction"],"prefix":"10.1093","volume":"37","author":[{"given":"Yuantong","family":"Li","sequence":"first","affiliation":[{"name":"Department of Statistics, Purdue University , West Lafayette, IN 47907, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fei","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Healthcare Policy and Research, Cornell University Weill Medical School , New York, NY 10065, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mengying","family":"Yan","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Bioinformatics, Duke University , Durham, NC 27710, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Edward","family":"Cantu III","sequence":"additional","affiliation":[{"name":"Department of Surgery, University of Pennsylvania , Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fan Nils","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Neuroscience, Georgetown University , Washington, DC 20057, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hengyi","family":"Rao","sequence":"additional","affiliation":[{"name":"Department of Neurology, University of Pennsylvania , Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4151-7228","authenticated-orcid":false,"given":"Rui","family":"Feng","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania , Philadelphia, PA 19104, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2021,5,27]]},"reference":[{"key":"2023051607095035300_btab402-B1","first-page":"110","article-title":"Feature selection for clustering: a review","volume":"29","author":"Alelyani","year":"2013","journal-title":"Data Cluster. Algorithms Appl"},{"key":"2023051607095035300_btab402-B2","doi-asserted-by":"crossref","first-page":"2140","DOI":"10.1111\/j.1600-6143.2008.02354.x","article-title":"Impact of human donor lung gene expression profiles on survival after lung transplantation: a case-control study","volume":"8","author":"Anraku","year":"2008","journal-title":"Am. J. Transplant"},{"key":"2023051607095035300_btab402-B3","doi-asserted-by":"crossref","first-page":"1046","DOI":"10.1164\/rccm.201912-2436LE","article-title":"Pre-procurement in situ donor lung tissue gene expression classifies primary graft dysfunction risk","volume":"202","author":"Cantu","year":"2020","journal-title":"Am. J. Respir. Critical Care Med"},{"key":"2023051607095035300_btab402-B4","first-page":"785","author":"Chen","year":"2016"},{"key":"2023051607095035300_btab402-B5","doi-asserted-by":"crossref","first-page":"1451","DOI":"10.1016\/j.healun.2005.03.004","article-title":"Report of the ISHLT working group on primary lung graft dysfunction part I: introduction and methods","volume":"24","author":"Christie","year":"2005","journal-title":"J. Heart Lung Transplant"},{"key":"2023051607095035300_btab402-B6","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1016\/j.healun.2010.05.013","article-title":"Construct validity of the definition of primary graft dysfunction after lung transplantation","volume":"29","author":"Christie","year":"2010","journal-title":"J. Heart Lung Transplant"},{"key":"2023051607095035300_btab402-B7","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1159\/000152448","article-title":"A general model for the genetic analysis of pedigree data","volume":"21","author":"Elston","year":"1971","journal-title":"Hum. Hered"},{"key":"2023051607095035300_btab402-B8","doi-asserted-by":"crossref","first-page":"916","DOI":"10.1214\/07-AOAS148","article-title":"Predictive learning via rule ensembles","volume":"2","author":"Friedman","year":"2008","journal-title":"Ann. Appl. Stat"},{"key":"2023051607095035300_btab402-B9","volume-title":"The Elements of Statistical Learning","author":"Friedman","year":"2001"},{"key":"2023051607095035300_btab402-B10","doi-asserted-by":"crossref","first-page":"2414","DOI":"10.1093\/nar\/gkr1110","article-title":"Gene array analyzer: alternative usage of gene arrays to study alternative splicing events","volume":"40","author":"Gellert","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023051607095035300_btab402-B11","volume-title":"Deep Learning","author":"Goodfellow","year":"2016"},{"key":"2023051607095035300_btab402-B12","doi-asserted-by":"crossref","first-page":"510","DOI":"10.1186\/s12859-018-2500-z","article-title":"Pasnet: pathway-associated sparse deep neural network for prognosis prediction from high-throughput data","volume":"19","author":"Hao","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023051607095035300_btab402-B13","first-page":"926","article-title":"A practical guide to training restricted Boltzmann machines","volume":"9","author":"Hinton","year":"2010","journal-title":"Momentum"},{"key":"2023051607095035300_btab402-B14","volume-title":"Differential Equations, Dynamical Systems, and Linear Algebra","author":"Hirsch","year":"1974"},{"key":"2023051607095035300_btab402-B15","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1016\/j.jbi.2014.11.013","article-title":"Stable feature selection for clinical prediction: exploiting ICD tree structure using tree-lasso","volume":"53","author":"Kamkar","year":"2015","journal-title":"J. Biomed. Inf"},{"key":"2023051607095035300_btab402-B16","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"KEGG: Kyoto Encyclopedia of Genes and Genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023051607095035300_btab402-B17","doi-asserted-by":"crossref","first-page":"D355","DOI":"10.1093\/nar\/gkp896","article-title":"KEGG for representation and analysis of molecular networks involving diseases and drugs","volume":"38","author":"Kanehisa","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023051607095035300_btab402-B18","first-page":"3727","article-title":"A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data","volume":"34","author":"Kong","year":"2018","journal-title":"Ann. Appl. Stat"},{"key":"2023051607095035300_btab402-B19","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"nature"},{"key":"2023051607095035300_btab402-B20","first-page":"pp. 2287","author":"Liu","year":"2017"},{"key":"2023051607095035300_btab402-B21","doi-asserted-by":"crossref","first-page":"1079","DOI":"10.1111\/j.1541-0420.2007.00799.x","article-title":"Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models","volume":"63","author":"Liu","year":"2007","journal-title":"Biometrics"},{"key":"2023051607095035300_btab402-B22","first-page":"1459","article-title":"Moreau-Yosida regularization for grouped tree structure learning","author":"Liu","year":"2010"},{"key":"2023051607095035300_btab402-B23","first-page":"487","author":"Liu","year":"2011"},{"key":"2023051607095035300_btab402-B24","author":"Romero","year":"2016"},{"key":"2023051607095035300_btab402-B25","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.neucom.2017.02.029","article-title":"Group sparse regularization for deep neural networks","volume":"241","author":"Scardapane","year":"2017","journal-title":"Neurocomputing"},{"key":"2023051607095035300_btab402-B26","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.neunet.2014.09.003","article-title":"Deep learning in neural networks: an overview","volume":"61","author":"Schmidhuber","year":"2015","journal-title":"Neural Netw"},{"key":"2023051607095035300_btab402-B27","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res"},{"key":"2023051607095035300_btab402-B28","author":"Tartaglione","year":"2018"},{"key":"2023051607095035300_btab402-B29","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B (Methodological)"},{"key":"2023051607095035300_btab402-B30","doi-asserted-by":"crossref","first-page":"1986","DOI":"10.1093\/bioinformatics\/btr300","article-title":"Classification with correlated features: unreliability of feature ranking and solutions","volume":"27","author":"Tolosi","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051607095035300_btab402-B31","author":"Wu","year":"2017"},{"key":"2023051607095035300_btab402-B32","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.ajhg.2011.05.029","article-title":"Rare variant association testing for sequencing data with the sequence kernel association test (skat)","volume":"89","author":"Wu","year":"2011","journal-title":"Am. J. Hum. Genet"},{"key":"2023051607095035300_btab402-B33","author":"Zhang","year":"2017"},{"key":"2023051607095035300_btab402-B34","doi-asserted-by":"crossref","first-page":"3468","DOI":"10.1214\/07-AOS584","article-title":"The composite absolute penalties family for grouped and hierarchical variable selection","volume":"37","author":"Zhao","year":"2009","journal-title":"Ann. Stat"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab402\/38934281\/btab402.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/22\/4108\/50335158\/btab402.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/22\/4108\/50335158\/btab402.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,31]],"date-time":"2024-08-31T16:45:52Z","timestamp":1725122752000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/22\/4108\/6286960"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,5,27]]},"references-count":34,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2021,11,18]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab402","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,11,15]]},"published":{"date-parts":[[2021,5,27]]}}}