{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:24Z","timestamp":1772138064010,"version":"3.50.1"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2023,6,16]],"date-time":"2023-06-16T00:00:00Z","timestamp":1686873600000},"content-version":"vor","delay-in-days":15,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"IPMB, University Heidelberg"},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Variational autoencoders (VAEs) have rapidly increased in popularity in biological applications and have already successfully been used on many omic datasets. Their latent space provides a low-dimensional representation of input data, and VAEs have been applied, e.g. for clustering of single-cell transcriptomic data. However, due to their non-linear nature, the patterns that VAEs learn in the latent space remain obscure. 
Hence, the lower-dimensional data embedding cannot directly be related to input features.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>To shed light on the inner workings of VAEs and enable direct interpretability of the model through its structure, we designed a novel VAE, OntoVAE (Ontology guided VAE), that can incorporate any ontology in its latent space and decoder part and, thus, provide pathway or phenotype activities for the ontology terms. In this work, we demonstrate that OntoVAE can be applied in the context of predictive modeling and show its ability to predict the effects of genetic or drug-induced perturbations using different ontologies and both bulk and single-cell transcriptomic datasets. Finally, we provide a flexible framework, which can be easily adapted to any ontology and dataset.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>OntoVAE is available as a Python package at https:\/\/github.com\/hdsu-bioquant\/onto-vae.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad387","type":"journal-article","created":{"date-parts":[[2023,6,15]],"date-time":"2023-06-15T00:52:00Z","timestamp":1686790320000},"source":"Crossref","is-referenced-by-count":43,"title":["Biologically informed variational autoencoders allow predictive modeling of genetic and drug-induced perturbations"],"prefix":"10.1093","volume":"39","author":[{"given":"Daria","family":"Doncevic","sequence":"first","affiliation":[{"name":"Health Data Science Unit and BioQuant, Medical Faculty Heidelberg , Im Neuenheimer Feld 267 , 69120 Heidelberg, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4989-4722","authenticated-orcid":false,"given":"Carl","family":"Herrmann","sequence":"additional","affiliation":[{"name":"Health 
Data Science Unit and BioQuant, Medical Faculty Heidelberg , Im Neuenheimer Feld 267 , 69120 Heidelberg, Germany"}]}],"member":"286","published-online":{"date-parts":[[2023,6,16]]},"reference":[{"key":"2023062809444581900_btad387-B1","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat Genet"},{"key":"2023062809444581900_btad387-B2","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1186\/s12859-021-04370-7","article-title":"Deep GONet: self-explainable deep neural network based on gene ontology for phenotype prediction from gene expression data","volume":"22","author":"Bourgeais","year":"2021","journal-title":"BMC Bioinformatics"},{"key":"2023062809444581900_btad387-B3","doi-asserted-by":"crossref","first-page":"2504","DOI":"10.1093\/bioinformatics\/btac147","article-title":"GraphGONet: a self-explaining neural network encapsulating the gene ontology graph for phenotype prediction on gene expression","volume":"38","author":"Bourgeais","year":"2022","journal-title":"Bioinformatics"},{"key":"2023062809444581900_btad387-B4","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1089\/bio.2015.0032","article-title":"A novel approach to high-quality postmortem tissue procurement: the GTEx project","volume":"13","author":"Carithers","year":"2015","journal-title":"Biopreserv Biobank"},{"key":"2023062809444581900_btad387-B5","doi-asserted-by":"crossref","first-page":"1508","DOI":"10.3390\/cells11091508","article-title":"Unraveling the molecular basis of the dystrophic process in limb-girdle muscular dystrophy LGMD-R12 by differential gene expression profiles in diseased and healthy muscles","volume":"11","author":"Depuydt","year":"2022","journal-title":"Cells"},{"key":"2023062809444581900_btad387-B6","doi-asserted-by":"crossref","first-page":"856","DOI":"10.1038\/s41467-020-14666-6","article-title":"Deriving 
disease modules from the compressed transcriptional space embedded in a deep autoencoder","volume":"11","author":"Dwivedi","year":"2020","journal-title":"Nat Commun"},{"key":"2023062809444581900_btad387-B7","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1038\/s41467-018-07931-2","article-title":"Single-cell RNA-seq denoising using a deep count autoencoder","volume":"10","author":"Eraslan","year":"2019","journal-title":"Nat Commun"},{"key":"2023062809444581900_btad387-B8","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1186\/s13059-020-02100-5","article-title":"Knowledge-primed neural networks enable biologically interpretable deep learning on single-cell sequencing data","volume":"21","author":"Fortelny","year":"2020","journal-title":"Genome Biol"},{"key":"2023062809444581900_btad387-B9","doi-asserted-by":"crossref","first-page":"D258","DOI":"10.1093\/nar\/gkh036","article-title":"The Gene Ontology (GO) database and informatics resource","volume":"32","author":"Harris","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023062809444581900_btad387-B10","first-page":"26711","author":"Hetzel"},{"key":"2023062809444581900_btad387-B11","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1126\/science.1127647","article-title":"Reducing the dimensionality of data with neural networks","volume":"313","author":"Hinton","year":"2006","journal-title":"Science"},{"key":"2023062809444581900_btad387-B12","author":"Huang"},{"key":"2023062809444581900_btad387-B13","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1038\/nbt.4042","article-title":"Multiplexed droplet single-cell RNA-sequencing using natural genetic variation","volume":"36","author":"Kang","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2023062809444581900_btad387-B14","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1561\/2200000056","article-title":"An introduction to variational 
autoencoders","volume":"12","author":"Kingma","year":"2019","journal-title":"FNT in Machine Learning"},{"key":"2023062809444581900_btad387-B15","author":"Kingma","year":"2013"},{"key":"2023062809444581900_btad387-B16","author":"Li","year":"2022"},{"key":"2023062809444581900_btad387-B17","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1038\/s41592-018-0229-2","article-title":"Deep generative modeling for single-cell transcriptomics","volume":"15","author":"Lopez","year":"2018","journal-title":"Nat Methods"},{"key":"2023062809444581900_btad387-B18","first-page":"337","article-title":"Biologically informed deep learning to query gene programs in single-cell atlases","volume":"25","author":"Lotfollahi","year":"2023","journal-title":"Nat Cell Biol"},{"key":"2023062809444581900_btad387-B19","doi-asserted-by":"crossref","first-page":"715","DOI":"10.1038\/s41592-019-0494-8","article-title":"scGen predicts single-cell perturbation responses","volume":"16","author":"Lotfollahi","year":"2019","journal-title":"Nat Methods"},{"key":"2023062809444581900_btad387-B20","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1038\/s41540-022-00218-9","article-title":"Deep neural network prediction of genome-wide transcriptome signatures - beyond the black-box","volume":"8","author":"Magnusson","year":"2022","journal-title":"NPJ Syst Biol Appl"},{"key":"2023062809444581900_btad387-B21","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1038\/nmeth.4627","article-title":"Using deep learning to model the hierarchical structure and function of a cell","volume":"15","author":"Ma","year":"2018","journal-title":"Nat Methods"},{"key":"2023062809444581900_btad387-B22","doi-asserted-by":"crossref","first-page":"872","DOI":"10.1038\/sj.embor.7400221","article-title":"Duchenne muscular dystrophy and dystrophin: pathogenesis and opportunities for treatment","volume":"5","author":"Nowak","year":"2004","journal-title":"EMBO 
Rep"},{"key":"2023062809444581900_btad387-B23","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1186\/s12859-019-2769-6","article-title":"Combining gene ontology with deep neural networks to enhance the clustering of single cell RNA-Seq data","volume":"20","author":"Peng","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023062809444581900_btad387-B24","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1016\/j.ajhg.2008.09.017","article-title":"The human phenotype ontology: a tool for annotating and analyzing human hereditary disease","volume":"83","author":"Robinson","year":"2008","journal-title":"Am J Hum Genet"},{"key":"2023062809444581900_btad387-B25","author":"Rybakov","year":"2020"},{"key":"2023062809444581900_btad387-B26","doi-asserted-by":"crossref","first-page":"5684","DOI":"10.1038\/s41467-021-26017-0","article-title":"VEGA is an interpretable generative model for inferring biological network activity in single-cell transcriptomics","volume":"12","author":"Seninge","year":"2021","journal-title":"Nat Commun"},{"key":"2023062809444581900_btad387-B27","doi-asserted-by":"crossref","first-page":"3418","DOI":"10.1093\/bioinformatics\/btaa169","article-title":"Interpretable factor models of single-cell RNA-seq via variational autoencoders","volume":"36","author":"Svensson","year":"2020","journal-title":"Bioinformatics"},{"key":"2023062809444581900_btad387-B28","doi-asserted-by":"crossref","first-page":"1029","DOI":"10.1038\/s41467-021-21312-2","article-title":"Fast and precise single-cell data analysis using a hierarchical autoencoder","volume":"12","author":"Tran","year":"2021","journal-title":"Nat Commun"},{"key":"2023062809444581900_btad387-B29","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1038\/s41592-019-0537-1","article-title":"Data denoising with transfer learning in single-cell transcriptomics","volume":"16","author":"Wang","year":"2019","journal-title":"Nat 
Methods"},{"key":"2023062809444581900_btad387-B30","doi-asserted-by":"crossref","first-page":"1274","DOI":"10.1093\/bioinformatics\/btm087","article-title":"A new method to measure the semantic similarity of GO terms","volume":"23","author":"Wang","year":"2007","journal-title":"Bioinformatics"},{"key":"2023062809444581900_btad387-B31","first-page":"80","article-title":"Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders","volume":"23","author":"Way","year":"2018","journal-title":"Pac Symp Biocomput"},{"key":"2023062809444581900_btad387-B32","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1186\/s13059-021-02533-6","article-title":"recount3: summaries and queries for large-scale RNA-seq expression and splicing","volume":"22","author":"Wilks","year":"2021","journal-title":"Genome Biol"},{"key":"2023062809444581900_btad387-B33","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1038\/s41467-020-20249-2","article-title":"Multi-domain translation between single-cell imaging and sequencing data using autoencoders","volume":"12","author":"Yang","year":"2021","journal-title":"Nat 
Commun"},{"key":"2023062809444581900_btad387-B34","author":"Zhang"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad387\/50628658\/btad387.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/6\/btad387\/50729813\/btad387.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/6\/btad387\/50729813\/btad387.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,28]],"date-time":"2023-06-28T05:45:26Z","timestamp":1687931126000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad387\/7199588"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,6,1]]},"references-count":34,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad387","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.09.20.508703","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,6,1]]},"published":{"date-parts":[[2023,6,1]]},"article-number":"btad387"}}