{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T08:52:54Z","timestamp":1765356774123,"version":"3.37.3"},"reference-count":15,"publisher":"Oxford University Press (OUP)","issue":"Supplement_2","license":[{"start":{"date-parts":[[2020,12,1]],"date-time":"2020-12-01T00:00:00Z","timestamp":1606780800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Genome Canada Large Scale Application Research"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,12,30]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The ability to develop robust machine-learning (ML) models is considered imperative to the adoption of ML techniques in biology and medicine fields. This challenge is particularly acute when data available for training is not independent and identically distributed (iid), in which case trained models are vulnerable to out-of-distribution generalization problems. Of particular interest are problems where data correspond to observations made on phylogenetically related samples (e.g. antibiotic resistance data).<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We introduce DendroNet, a new approach to train neural networks in the context of evolutionary data. DendroNet explicitly accounts for the relatedness of the training\/testing data, while allowing the model to evolve along the branches of the phylogenetic tree, hence accommodating potential changes in the rules that relate genotypes to phenotypes. 
Using simulated data, we demonstrate that DendroNet produces models that can be significantly better than non-phylogenetically aware approaches. DendroNet also outperforms other approaches at two biological tasks of significant practical importance: antibiotic resistance prediction in bacteria and trophic level prediction in fungi.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/BlanchetteLab\/DendroNet.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa842","type":"journal-article","created":{"date-parts":[[2020,9,14]],"date-time":"2020-09-14T19:14:17Z","timestamp":1600110857000},"page":"i895-i902","source":"Crossref","is-referenced-by-count":7,"title":["Supervised learning on phylogenetically distributed data"],"prefix":"10.1093","volume":"36","author":[{"given":"Elliot","family":"Layne","sequence":"first","affiliation":[{"name":"School of Computer Science, McGill, Montreal, QC H3A 0E9, Canada"}]},{"given":"Erika N","family":"Dort","sequence":"additional","affiliation":[{"name":"Department of Forestry and Conservation Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada"}]},{"given":"Richard","family":"Hamelin","sequence":"additional","affiliation":[{"name":"Department of Forestry and Conservation Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada"}]},{"given":"Yue","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science, McGill, Montreal, QC H3A 0E9, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9555-860X","authenticated-orcid":false,"given":"Mathieu","family":"Blanchette","sequence":"additional","affiliation":[{"name":"School of Computer Science , McGill, Montreal, QC H3A 0E9, 
Canada"}]}],"member":"286","published-online":{"date-parts":[[2020,12,29]]},"reference":[{"year":"2015","author":"Abadi","key":"2023062409322489100_btaa842-B1"},{"key":"2023062409322489100_btaa842-B2","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat. Biotechnol"},{"first-page":"1","year":"2012","author":"Alippi","key":"2023062409322489100_btaa842-B3"},{"key":"2023062409322489100_btaa842-B7","doi-asserted-by":"crossref","first-page":"754","DOI":"10.1186\/s12864-016-2889-6","article-title":"Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons","volume":"17","author":"Drouin","year":"2016","journal-title":"BMC Genomics"},{"key":"2023062409322489100_btaa842-B8","doi-asserted-by":"crossref","first-page":"16041","DOI":"10.1038\/nmicrobiol.2016.41","article-title":"Identifying lineage effects when controlling for population structure improves power in bacterial association studies","volume":"1","author":"Earle","year":"2016","journal-title":"Nat. Microbiol"},{"key":"2023062409322489100_btaa842-B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1086\/284325","article-title":"Phylogenies and the comparative method","volume":"125","author":"Felsenstein","year":"1985","journal-title":"Am. Natural"},{"key":"2023062409322489100_btaa842-B12","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1111\/eva.12853","article-title":"Genomic biosurveillance of forest invasive alien enemies: a story written in code","volume":"13","author":"Hamelin","year":"2020","journal-title":"Evol. 
Appl"},{"year":"2014","author":"Kingma","key":"2023062409322489100_btaa842-B14"},{"key":"2023062409322489100_btaa842-B15","doi-asserted-by":"crossref","first-page":"D26","DOI":"10.1093\/nar\/gkt1069","article-title":"The genome portal of the Department of Energy Joint Genome Institute: 2014 updates","volume":"42","author":"Nordberg","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023062409322489100_btaa842-B16","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1104\/pp.110.161315","article-title":"Gene clusters for secondary metabolic pathways: an emerging theme in plant biology","volume":"154","author":"Osbourn","year":"2010","journal-title":"Plant Physiol"},{"year":"2017","author":"Paszke","key":"2023062409322489100_btaa842-B17"},{"first-page":"1","year":"2014","author":"Raza","key":"2023062409322489100_btaa842-B18"},{"key":"2023062409322489100_btaa842-B19","doi-asserted-by":"crossref","first-page":"e1007309","DOI":"10.1371\/journal.pgen.1007309","article-title":"Population structure in genetic studies: confounding factors and mixed models","volume":"14","author":"Sul","year":"2018","journal-title":"PLoS Genet"},{"key":"2023062409322489100_btaa842-B20","first-page":"1305","article-title":"Active transfer learning under model shift","volume":"32","author":"Wang","year":"2014","journal-title":"Proc. Mach. Learn. 
Res"},{"key":"2023062409322489100_btaa842-B22","doi-asserted-by":"crossref","first-page":"D535","DOI":"10.1093\/nar\/gkw1017","article-title":"Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center","volume":"45","author":"Wattam","year":"2017","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_2\/i895\/50693312\/btaa842.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_2\/i895\/50693312\/btaa842.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T09:32:54Z","timestamp":1687599174000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/Supplement_2\/i895\/6055926"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12]]},"references-count":15,"journal-issue":{"issue":"Supplement_2","published-print":{"date-parts":[[2020,12,30]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa842","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2020,12]]},"published":{"date-parts":[[2020,12]]}}}