{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T15:25:55Z","timestamp":1772205955661,"version":"3.50.1"},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T00:00:00Z","timestamp":1686268800000},"content-version":"vor","delay-in-days":8,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020 research and innovation program","award":["101016233"],"award-info":[{"award-number":["101016233"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>With the exponential growth of expression and protein\u2013protein interaction (PPI) data, the identification of functional modules in PPI networks that show striking changes in molecular activity or phenotypic signatures becomes of particular interest to reveal process-specific information that is correlated with cellular or disease states. This requires both the identification of network nodes with reliability scores and the availability of an efficient technique to locate the network regions with the highest scores. In the literature, a number of heuristic methods have been suggested. We propose SEMtree(), a set of tree-based structure discovery algorithms, combining graph and statistically interpretable parameters together with a user-friendly R package based on structural equation models framework.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Condition-specific changes from differential expression and gene\u2013gene co-expression are recovered with statistical testing of node, directed edge, and directed path difference between groups. In the end, from a list of seed (i.e. disease) genes or gene P-values, the perturbed modules with undirected edges are generated with five state-of-the-art active subnetwork detection methods. The latter are supplied to causal additive trees based on Chu\u2013Liu\u2013Edmonds\u2019 algorithm (Chow and Liu, Approximating discrete probability distributions with dependence trees. IEEE Trans Inform Theory 1968;14:462\u20137) in SEMtree() to be converted in directed trees. This conversion allows to compare the methods in terms of directed active subnetworks. We applied SEMtree() to both Coronavirus disease (COVID-19) RNA-seq dataset (GEO accession: GSE172114) and simulated datasets with various differential expression patterns. Compared to existing methods, SEMtree() is able to capture biologically relevant subnetworks with simple visualization of directed paths, good perturbation extraction, and classifier performance.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>SEMtree() function is implemented in the R package SEMgraph, easily available at https:\/\/CRAN.R-project.org\/package=SEMgraph.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad377","type":"journal-article","created":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T18:05:21Z","timestamp":1686333921000},"source":"Crossref","is-referenced-by-count":8,"title":["SEMtree: tree-based structure learning methods with structural equation models"],"prefix":"10.1093","volume":"39","author":[{"given":"Mario","family":"Grassi","sequence":"first","affiliation":[{"name":"Department of Brain and Behavioral Sciences, University of Pavia , Pavia 27100, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6384-2561","authenticated-orcid":false,"given":"Barbara","family":"Tarantino","sequence":"additional","affiliation":[{"name":"Department of Brain and Behavioral Sciences, University of Pavia , Pavia 27100, Italy"}]}],"member":"286","published-online":{"date-parts":[[2023,6,9]]},"reference":[{"key":"2023062303470394300_btad377-B1","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1016\/j.econmod.2019.11.005","article-title":"Tree networks to assess financial contagion","volume":"85","author":"Agosto","year":"2020","journal-title":"Econ Model"},{"key":"2023062303470394300_btad377-B2","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1016\/j.physa.2019.01.130","article-title":"Latent factor models for credit scoring in p2p systems","volume":"522","author":"Ahelegbey","year":"2019","journal-title":"Phys A Stat Mech Appl"},{"key":"2023062303470394300_btad377-B3","doi-asserted-by":"crossref","first-page":"155","DOI":"10.3389\/fcell.2018.00155","article-title":"Autophagy-virus interplay: from cell biology to human disease","volume":"6","author":"Ahmad","year":"2018","journal-title":"Front Cell Dev Biol"},{"key":"2023062303470394300_btad377-B4","first-page":"482","article-title":"A novel pathway analysis approach based on the unexplained disregulation of genes","volume":"105","author":"Ansari","year":"2017","journal-title":"Proc IEEE"},{"key":"2023062303470394300_btad377-B5","doi-asserted-by":"crossref","first-page":"556","DOI":"10.3390\/biomedicines9050556","article-title":"Predicting COVID-19-comorbidity pathway crosstalk-based targets and drugs: towards personalized COVID-19 management","volume":"9","author":"Barh","year":"2021","journal-title":"Biomedicines"},{"key":"2023062303470394300_btad377-B6","doi-asserted-by":"crossref","first-page":"1129","DOI":"10.1093\/bioinformatics\/btq089","article-title":"BioNet: an R-package for the functional analysis of biological networks","volume":"26","author":"Beisser","year":"2010","journal-title":"Bioinformatics"},{"key":"2023062303470394300_btad377-B7","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J R Stat Soc Ser B"},{"key":"2023062303470394300_btad377-B8","doi-asserted-by":"crossref","first-page":"1075","DOI":"10.1198\/jasa.2011.tm10183","article-title":"Hierarchical clustering with prototypes via minimax linkage","volume":"106","author":"Bien","year":"2011","journal-title":"J Am Stat Assoc"},{"key":"2023062303470394300_btad377-B9","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach Learn"},{"key":"2023062303470394300_btad377-B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1126\/scitranslmed.abj7521","article-title":"Identification of driver genes for critical forms of COVID-19 in a deeply phenotyped young patient cohort","volume":"14","author":"Carapito","year":"2022","journal-title":"Sci Transl Med"},{"key":"2023062303470394300_btad377-B11","author":"Chatterjee","year":"2022"},{"key":"2023062303470394300_btad377-B12","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1109\/TIT.1968.1054142","article-title":"Approximating discrete probability distributions with dependence trees","volume":"14","author":"Chow","year":"1968","journal-title":"IEEE Trans Inform Theory"},{"key":"2023062303470394300_btad377-B13","doi-asserted-by":"crossref","first-page":"S249","DOI":"10.33549\/physiolres.934725","article-title":"Could the CCR5-Delta32 mutation be protective in SARS-CoV-2 infection?","volume":"70","author":"\u010cizmarevi\u0107","year":"2021","journal-title":"Physiol Res"},{"key":"2023062303470394300_btad377-B14","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1007\/s10479-019-03282-3","article-title":"Crypto price discovery through correlation networks","volume":"299","author":"Giudici","year":"2021","journal-title":"Ann Oper Res"},{"key":"2023062303470394300_btad377-B15","doi-asserted-by":"crossref","first-page":"344","DOI":"10.1186\/s12859-022-04884-8","article-title":"SEMgsa: topology-based pathway enrichment analysis with structural equation models","volume":"23","author":"Grassi","year":"2022","journal-title":"BMC Bioinformatics"},{"key":"2023062303470394300_btad377-B16","doi-asserted-by":"crossref","first-page":"4829","DOI":"10.1093\/bioinformatics\/btac567","article-title":"SEMgraph: an R package for causal network inference of high-throughput data with structural equation models","volume":"38","author":"Grassi","year":"2022","journal-title":"Bioinformatics"},{"key":"2023062303470394300_btad377-B17","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1186\/1752-0509-4-47","article-title":"Identification of responsive gene modules by network-based gene clustering and extending: application to inflammation and angiogenesis","volume":"4","author":"Gu","year":"2010","journal-title":"BMC Syst Biol"},{"key":"2023062303470394300_btad377-B18","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1146\/annurev-statistics-031017-100630","article-title":"Causal structure learning","volume":"5","author":"Heinze-Deml","year":"2018","journal-title":"Annu Rev Stat Appl"},{"key":"2023062303470394300_btad377-B19","doi-asserted-by":"crossref","first-page":"S233","DOI":"10.1093\/bioinformatics\/18.suppl_1.S233","article-title":"Discovering regulatory and signalling circuits in molecular interaction networks","volume":"18","author":"Ideker","year":"2002","journal-title":"Bioinformatics"},{"key":"2023062303470394300_btad377-B20","first-page":"1","article-title":"Structure learning for directed trees","volume":"23","author":"Jakobsen","year":"2022","journal-title":"J Mach Learn Res"},{"key":"2023062303470394300_btad377-B21","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"KEGG: Kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023062303470394300_btad377-B22","volume-title":"Algorithm Design","author":"Kleinberg"},{"key":"2023062303470394300_btad377-B23","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1007\/BF00288961","article-title":"A fast algorithm for Steiner trees","volume":"15","author":"Kou","year":"1981","journal-title":"Acta Inform"},{"key":"2023062303470394300_btad377-B24","doi-asserted-by":"crossref","DOI":"10.15252\/msb.202110396","article-title":"SARS-CoV-2\u2013host proteome interactions for antiviral drug discovery","volume":"17","author":"Liu","year":"2021","journal-title":"Mol Syst Biol"},{"key":"2023062303470394300_btad377-B25","author":"Lou"},{"key":"2023062303470394300_btad377-B26","doi-asserted-by":"crossref","first-page":"1290","DOI":"10.1093\/bioinformatics\/btr136","article-title":"COSINE: COndition-SpecIfic Sub-NEtwork identification using a global optimization method","volume":"27","author":"Ma","year":"2011","journal-title":"Bioinformatics"},{"key":"2023062303470394300_btad377-B27","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1038\/nrg3552","article-title":"Integrative approaches for finding modular structure in biological networks","volume":"14","author":"Mitra","year":"2013","journal-title":"Nat Rev Genet"},{"key":"2023062303470394300_btad377-B28","doi-asserted-by":"crossref","DOI":"10.3389\/fgene.2019.00155","article-title":"A comprehensive survey of tools and software for active subnetwork identification","volume":"10","author":"Nguyen","year":"2019","journal-title":"Front Genet"},{"key":"2023062303470394300_btad377-B29","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1186\/1756-0381-6-17","article-title":"Using random walks to identify cancer-associated modules in expression data","volume":"6","author":"Petrochilos","year":"2013","journal-title":"BioData Min"},{"key":"2023062303470394300_btad377-B30","doi-asserted-by":"crossref","first-page":"1389","DOI":"10.1002\/j.1538-7305.1957.tb01515.x","article-title":"Shortest connection networks and some generalizations","volume":"36","author":"Prim","year":"1957","journal-title":"Bell Syst Tech J"},{"key":"2023062303470394300_btad377-B31","first-page":"397","volume-title":"limma: Linear Models for Microarray Data","author":"Smyth","year":"2005"},{"key":"2023062303470394300_btad377-B32","doi-asserted-by":"crossref","DOI":"10.1038\/s41598-021-03309-5","article-title":"Identification of transcriptional regulatory network associated with response of host epithelial cells to SARS-CoV-2","volume":"11","author":"Su","year":"2021","journal-title":"Sci Rep"},{"key":"2023062303470394300_btad377-B33","first-page":"1","article-title":"The role of autophagy and nlrp3 inflammasome in liver fibrosis","volume":"2020","author":"Tao","year":"2020","journal-title":"BioMed Res Int"},{"key":"2023062303470394300_btad377-B34","first-page":"1960","volume-title":"Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence","author":"Tramontano","year":"2022"},{"key":"2023062303470394300_btad377-B35","doi-asserted-by":"crossref","first-page":"858","DOI":"10.3389\/fgene.2019.00858","article-title":"pathfindR: an R package for comprehensive identification of enriched pathways in omics data through active subnetworks","volume":"10","author":"Ulgen","year":"2019","journal-title":"Front Genet"},{"key":"2023062303470394300_btad377-B36","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1016\/j.cell.2011.02.016","article-title":"Interactome networks and human disease","volume":"144","author":"Vidal","year":"2011","journal-title":"Cell"},{"key":"2023062303470394300_btad377-B37","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1111\/j.1467-9868.2011.00783.x","article-title":"Penalized classification using Fisher\u2019s linear discriminant","volume":"73","author":"Witten","year":"2011","journal-title":"J R Stat Soc Ser B Stat Methodol"},{"key":"2023062303470394300_btad377-B38","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1016\/j.ygeno.2011.12.005","article-title":"GenRev: exploring functional relevance of genes in molecular networks","volume":"99","author":"Zheng","year":"2012","journal-title":"Genomics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad377\/50563070\/btad377.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/6\/btad377\/50683389\/btad377.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/6\/btad377\/50683389\/btad377.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,22]],"date-time":"2024-10-22T00:04:57Z","timestamp":1729555497000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad377\/7192988"}},"subtitle":[],"editor":[{"given":"Anthony","family":"Mathelier","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,6,1]]},"references-count":38,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad377","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,6,1]]},"published":{"date-parts":[[2023,6,1]]},"article-number":"btad377"}}