{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T12:43:27Z","timestamp":1775047407782,"version":"3.50.1"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"21","license":[{"start":{"date-parts":[[2022,9,8]],"date-time":"2022-09-08T00:00:00Z","timestamp":1662595200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"crossref","award":["SCHW 1768\/1-1"],"award-info":[{"award-number":["SCHW 1768\/1-1"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"crossref"}]},{"name":"German Federal Ministry of Education and Research (BMBF"},{"name":"eMed COMMITMENT","award":["01KU1905A"],"award-info":[{"award-number":["01KU1905A"]}]},{"name":"eMed COMMITMENT","award":["01ZX1904A"],"award-info":[{"award-number":["01ZX1904A"]}]},{"name":"European Union\u2019s Horizon 2020 research and innovation program under grant agreements","award":["826078"],"award-info":[{"award-number":["826078"]}]},{"name":"European Union\u2019s Horizon 2020 research and innovation program under grant agreements","award":["777111"],"award-info":[{"award-number":["777111"]}]},{"name":"HBCC dataset used in this study (dbGAP","award":["phs000979.v3.p2"],"award-info":[{"award-number":["phs000979.v3.p2"]}]},{"name":"Intramural Research Program of the NIMH","award":["NCT00001260"],"award-info":[{"award-number":["NCT00001260"]}]},{"name":"Intramural Research Program of the NIMH","award":["900142"],"award-info":[{"award-number":["900142"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,10,31]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    
<jats:p>In multi-cohort machine learning studies, it is critical to differentiate between effects that are reproducible across cohorts and those that are cohort-specific. Multi-task learning (MTL) is a machine learning approach that facilitates this differentiation through the simultaneous learning of prediction tasks across cohorts. Since multi-cohort data can often not be combined into a single storage solution, there would be the substantial utility of an MTL application for geographically distributed data sources.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here, we describe the development of \u2018dsMTL\u2019, a computational framework for privacy-preserving, distributed multi-task machine learning that includes three supervised and one unsupervised algorithms. First, we derive the theoretical properties of these methods and the relevant machine learning workflows to ensure the validity of the software implementation. Second, we implement dsMTL as a library for the R programming language, building on the DataSHIELD platform that supports the federated analysis of sensitive individual-level data. Third, we demonstrate the applicability of dsMTL for comorbidity modeling in distributed data. We show that comorbidity modeling using dsMTL outperformed conventional, federated machine learning, as well as the aggregation of multiple models built on the distributed datasets individually. 
The application of dsMTL was computationally efficient and highly scalable when applied to moderate-size (n\u2009&lt;\u2009500), real expression data given the actual network latency.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>dsMTL is freely available at https:\/\/github.com\/transbioZI\/dsMTLBase (server-side package) and https:\/\/github.com\/transbioZI\/dsMTLClient (client-side package).<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac616","type":"journal-article","created":{"date-parts":[[2022,9,8]],"date-time":"2022-09-08T09:30:20Z","timestamp":1662629420000},"page":"4919-4926","source":"Crossref","is-referenced-by-count":8,"title":["dsMTL: a computational framework for privacy-preserving, distributed multi-task machine learning"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5226-218X","authenticated-orcid":false,"given":"Han","family":"Cao","sequence":"first","affiliation":[{"name":"Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University , Mannheim 68158, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5998-1363","authenticated-orcid":false,"given":"Youcheng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Health Data Science Unit, Medical Faculty Heidelberg & BioQuant , Heidelberg 69120, Germany"}]},{"given":"Jan","family":"Baumbach","sequence":"additional","affiliation":[{"name":"Chair of Computational Systems Biology, University of Hamburg , Hamburg 22607, Germany"},{"name":"Computational Biomedicine Lab, Department of 
Mathematics and Computer Science, University of Southern Denmark , Odense 5230, Denmark"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5799-9634","authenticated-orcid":false,"given":"Paul R","family":"Burton","sequence":"additional","affiliation":[{"name":"Population Health Sciences Institute, Newcastle University , Newcastle upon Tyne NE2 4AX, UK"}]},{"given":"Dominic","family":"Dwyer","sequence":"additional","affiliation":[{"name":"Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University , Munich 80638, Germany"}]},{"given":"Nikolaos","family":"Koutsouleris","sequence":"additional","affiliation":[{"name":"Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University , Munich 80638, Germany"}]},{"given":"Julian","family":"Matschinske","sequence":"additional","affiliation":[{"name":"Chair of Computational Systems Biology, University of Hamburg , Hamburg 22607, Germany"}]},{"given":"Yannick","family":"Marcon","sequence":"additional","affiliation":[{"name":"Epigeny , St Ouen, France"}]},{"given":"Sivanesan","family":"Rajan","sequence":"additional","affiliation":[{"name":"Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University , Mannheim 68158, Germany"}]},{"given":"Thilo","family":"Rieg","sequence":"additional","affiliation":[{"name":"Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University , Mannheim 68158, Germany"}]},{"given":"Patricia","family":"Ryser-Welch","sequence":"additional","affiliation":[{"name":"Population Health Sciences Institute, Newcastle University , Newcastle upon Tyne NE2 4AX, UK"}]},{"given":"Julian","family":"Sp\u00e4th","sequence":"additional","affiliation":[{"name":"Chair of Computational Systems Biology, University of Hamburg , Hamburg 22607, Germany"}]},{"name":"The COMMITMENT 
Consortium","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4989-4722","authenticated-orcid":false,"given":"Carl","family":"Herrmann","sequence":"additional","affiliation":[{"name":"Health Data Science Unit, Medical Faculty Heidelberg & BioQuant , Heidelberg 69120, Germany"}]},{"given":"Emanuel","family":"Schwarz","sequence":"additional","affiliation":[{"name":"Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University , Mannheim 68158, Germany"}]}],"member":"286","published-online":{"date-parts":[[2022,9,8]]},"reference":[{"key":"2022103112474013800_btac616-B1","doi-asserted-by":"crossref","first-page":"5205","DOI":"10.1093\/bioinformatics\/btaa641","article-title":"Identifying disease-causing mutations with privacy protection","volume":"36","author":"Akgun","year":"2021","journal-title":"Bioinformatics"},{"key":"2022103112474013800_btac616-B2","doi-asserted-by":"crossref","first-page":"2202","DOI":"10.1093\/bioinformatics\/btac070","article-title":"Efficient privacy-preserving whole genome variant queries","volume":"38","author":"Akgun","year":"2022","journal-title":"Bioinformatics"},{"key":"2022103112474013800_btac616-B3","doi-asserted-by":"crossref","first-page":"3387","DOI":"10.3390\/ijms19113387","article-title":"Comparative evaluation of machine learning strategies for analyzing big data in psychiatry","volume":"19","author":"Cao","year":"2018","journal-title":"Int. J. Mol. 
Sci"},{"key":"2022103112474013800_btac616-B4","doi-asserted-by":"crossref","first-page":"1797","DOI":"10.1093\/bioinformatics\/bty831","article-title":"RMTL: an R library for multi-task learning","volume":"35","author":"Cao","year":"2019","journal-title":"Bioinformatics"},{"key":"2022103112474013800_btac616-B5","author":"Consotia","year":"2019"},{"key":"2022103112474013800_btac616-B6","doi-asserted-by":"crossref","first-page":"210091","DOI":"10.1098\/rsob.210091","article-title":"Emerging evidence implicating a role for neurexins in neurodegenerative and neuropsychiatric disorders","volume":"11","author":"Cuttler","year":"2021","journal-title":"Open Biol"},{"key":"2022103112474013800_btac616-B7","author":"Dahl","year":"2018"},{"key":"2022103112474013800_btac616-B8","first-page":"17","author":"Fredrikson","year":"2014"},{"key":"2022103112474013800_btac616-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Softw"},{"key":"2022103112474013800_btac616-B10","doi-asserted-by":"crossref","first-page":"9743","DOI":"10.1038\/s41598-018-28066-w","article-title":"Biomarker discovery by integrated joint non-negative matrix factorization and pathway signature analyses","volume":"8","author":"Fujita","year":"2018","journal-title":"Sci. Rep"},{"key":"2022103112474013800_btac616-B11","doi-asserted-by":"crossref","first-page":"1929","DOI":"10.1093\/ije\/dyu188","article-title":"DataSHIELD: taking the analysis to the data, not the data to the analysis","volume":"43","author":"Gaye","year":"2014","journal-title":"Int. J. 
Epidemiol"},{"key":"2022103112474013800_btac616-B12","author":"Hu","year":"2021"},{"key":"2022103112474013800_btac616-B13","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1016\/j.neuroimage.2013.04.061","article-title":"Multi-site genetic analysis of diffusion images and voxelwise heritability analysis: a pilot project of the ENIGMA\u2013DTI working group","volume":"81","author":"Jahanshad","year":"2013","journal-title":"Neuroimage"},{"key":"2022103112474013800_btac616-B14","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1016\/j.neuroimage.2014.03.033","article-title":"Multi-site study of additive genetic effects on fractional anisotropy of cerebral white matter: comparing meta and megaanalytical approaches for data pooling","volume":"95","author":"Kochunov","year":"2014","journal-title":"Neuroimage"},{"key":"2022103112474013800_btac616-B15","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1002\/1096-8628(20010108)105:1<99::AID-AJMG1071>3.0.CO;2-U","article-title":"An association study between polymorphisms of L1CAM gene and schizophrenia in a Japanese sample","volume":"105","author":"Kurumaji","year":"2001","journal-title":"Am. J. Med. Genet"},{"key":"2022103112474013800_btac616-B16","first-page":"50","article-title":"Federated learning: challenges, methods, and future directions","volume":"37","author":"Li","year":"2020","journal-title":"IEEE Signal Process. 
Mag"},{"key":"2022103112474013800_btac616-B17","doi-asserted-by":"crossref","first-page":"234","DOI":"10.1016\/S0140-6736(09)60072-6","article-title":"Common genetic determinants of schizophrenia and bipolar disorder in Swedish families: a population-based study","volume":"373","author":"Lichtenstein","year":"2009","journal-title":"Lancet"},{"key":"2022103112474013800_btac616-B18","author":"Matschinske","year":"2021"},{"key":"2022103112474013800_btac616-B19","doi-asserted-by":"crossref","first-page":"414","DOI":"10.3389\/fphar.2017.00414","article-title":"The emerging role for zinc in depression and psychosis","volume":"8","author":"Petrilli","year":"2017","journal-title":"Front. Pharmacol"},{"key":"2022103112474013800_btac616-B20","doi-asserted-by":"crossref","first-page":"bpaa022","DOI":"10.1093\/biomethods\/bpaa022","article-title":"ShinyButchR: interactive NMF-based decomposition workflow of genome-scale datasets","volume":"5","author":"Quintero","year":"2020","journal-title":"Biol. Methods Protoc"},{"key":"2022103112474013800_btac616-B21","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1038\/s41746-020-00323-1","article-title":"The future of digital health with federated learning","volume":"3","author":"Rieke","year":"2020","journal-title":"NPJ Digit. Med"},{"key":"2022103112474013800_btac616-B22","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1038\/nature13595","article-title":"Biological insights from 108 schizophrenia-associated genetic loci","volume":"511","author":"Schizophrenia Working Group of the Psychiatric Genomics Consortium","year":"2014","journal-title":"Nature"},{"key":"2022103112474013800_btac616-B23","doi-asserted-by":"crossref","first-page":"34","DOI":"10.23861\/EJBM201631752","article-title":"Autophagy and schizophrenia: a closer look at how dysregulation of neuronal cell homeostasis influences the pathogenesis of schizophrenia","volume":"31","author":"Schneider","year":"2016","journal-title":"Einstein J. Biol. 
Med"},{"key":"2022103112474013800_btac616-B24","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. B Methodol"},{"key":"2022103112474013800_btac616-B25","author":"Warnat-Herresthal","year":"2020"},{"key":"2022103112474013800_btac616-B26","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1038\/s41586-021-03583-3","article-title":"Swarm learning for decentralized and confidential clinical machine learning","volume":"594","author":"Warnat-Herresthal","year":"2021","journal-title":"Nature"},{"key":"2022103112474013800_btac616-B27","doi-asserted-by":"crossref","first-page":"1873","DOI":"10.1016\/j.cell.2019.05.006","article-title":"Single-Cell multi-omic integration compares and contrasts features of brain cell identity","volume":"177","author":"Welch","year":"2019","journal-title":"Cell"},{"key":"2022103112474013800_btac616-B28","article-title":"DataSHIELD\u2014new directions and dimensions","volume":"16, 21","author":"Wilson","year":"2017","journal-title":"Data Sci. J"},{"key":"2022103112474013800_btac616-B29","first-page":"1195","article-title":"Privacy-preserving distributed multi-task learning with asynchronous updates","author":"Xie","year":"2017"},{"key":"2022103112474013800_btac616-B30","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1504\/IJDMB.2011.043030","article-title":"Multi-platform gene-expression mining and marker gene analysis","volume":"5","author":"Xu","year":"2011","journal-title":"Int. J. Data Min. 
Bioinform"},{"key":"2022103112474013800_btac616-B31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/bioinformatics\/btv544","article-title":"A non-negative matrix factorization method for detecting modules in heterogeneous omics multi-modal data","volume":"32","author":"Yang","year":"2016","journal-title":"Bioinformatics"},{"key":"2022103112474013800_btac616-B32","doi-asserted-by":"crossref","first-page":"31619","DOI":"10.1038\/srep31619","article-title":"Multitask learning improves prediction of cancer drug sensitivity","volume":"6","author":"Yuan","year":"2016","journal-title":"Sci. Rep"},{"key":"2022103112474013800_btac616-B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/1900000062","article-title":"Distributed learning systems with first-order methods","volume":"9","author":"Zhang","year":"2020","journal-title":"FNT. Databases"},{"key":"2022103112474013800_btac616-B34","doi-asserted-by":"crossref","DOI":"10.1145\/2020408.2020549","article-title":"A multi-task learning formulation for predicting disease progression","author":"Zhou","year":"2011"},{"key":"2022103112474013800_btac616-B35","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1016\/j.neuroimage.2013.03.073","article-title":"Modeling disease progression via multi-task learning","volume":"78","author":"Zhou","year":"2013","journal-title":"Neuroimage"},{"key":"2022103112474013800_btac616-B36","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1186\/s13059-021-02553-2","article-title":"Flimma: a federated and privacy-aware tool for differential gene expression analysis","volume":"22","author":"Zolotareva","year":"2021","journal-title":"Genome 
Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac616\/45907848\/btac616.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/21\/4919\/46697912\/btac616.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/21\/4919\/46697912\/btac616.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,3]],"date-time":"2024-10-03T07:38:14Z","timestamp":1727941094000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/21\/4919\/6694043"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,9,8]]},"references-count":36,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,9,8]]},"published-print":{"date-parts":[[2022,10,31]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac616","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.08.26.457778","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,11,1]]},"published":{"date-parts":[[2022,9,8]]}}}