{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T22:40:49Z","timestamp":1776292849595,"version":"3.50.1"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2023,4,26]],"date-time":"2023-04-26T00:00:00Z","timestamp":1682467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"U.S. National Science Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,5,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The human microbiome, which is linked to various diseases by growing evidence, has a profound impact on human health. Since changes in the composition of the microbiome across time are associated with disease and clinical outcomes, microbiome analysis should be performed in a longitudinal study. However, due to limited sample sizes and differing numbers of timepoints for different subjects, a significant amount of data cannot be utilized, directly affecting the quality of analysis results. Deep generative models have been proposed to address this lack of data issue. Specifically, a generative adversarial network (GAN) has been successfully utilized for data augmentation to improve prediction tasks. Recent studies have also shown improved performance of GAN-based models for missing value imputation in a multivariate time series dataset compared with traditional imputation methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>This work proposes DeepMicroGen, a bidirectional recurrent neural network-based GAN model, trained on the temporal relationship between the observations, to impute the missing microbiome samples in longitudinal studies. DeepMicroGen outperforms standard baseline imputation methods, showing the lowest mean absolute error for both simulated and real datasets. Finally, the proposed model improved the predicted clinical outcome for allergies, by providing imputation for an incomplete longitudinal dataset used to train the classifier.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>DeepMicroGen is publicly available at https:\/\/github.com\/joungmin-choi\/DeepMicroGen.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad286","type":"journal-article","created":{"date-parts":[[2023,4,26]],"date-time":"2023-04-26T18:56:10Z","timestamp":1682535370000},"source":"Crossref","is-referenced-by-count":34,"title":["DeepMicroGen: a generative adversarial network-based method for longitudinal microbiome data imputation"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2090-3330","authenticated-orcid":false,"given":"Joung Min","family":"Choi","sequence":"first","affiliation":[{"name":"Department of Computer Science, Virginia Tech , Blacksburg, VA 24060, United States"}]},{"given":"Ming","family":"Ji","sequence":"additional","affiliation":[{"name":"College of Nursing, University of South Florida , Tampa, FL 33620, United States"}]},{"given":"Layne T","family":"Watson","sequence":"additional","affiliation":[{"name":"Departments of Computer Science, Mathematics, and Aerospace and Ocean Engineering, Virginia Tech , Blacksburg, VA 24060, United States"}]},{"given":"Liqing","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Virginia Tech , Blacksburg, VA 24060, United States"}]}],"member":"286","published-online":{"date-parts":[[2023,4,26]]},"reference":[{"key":"2023051907504217100_btad286-B1","first-page":"1","article-title":"Microbiome definition re-visited: old concepts and new challenges","volume":"8","author":"Berg","year":"2020","journal-title":"Microbiome"},{"key":"2023051907504217100_btad286-B2","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1038\/nmeth.f.303","article-title":"Qiime allows analysis of high-throughput community sequencing data","volume":"7","author":"Caporaso","year":"2010","journal-title":"Nat Methods"},{"key":"2023051907504217100_btad286-B3","doi-asserted-by":"crossref","first-page":"e1140","DOI":"10.7717\/peerj.1140","article-title":"Composition, taxonomy and functional diversity of the oropharynx microbiome in individuals with schizophrenia and controls","volume":"3","author":"Castro-Nallar","year":"2015","journal-title":"PeerJ"},{"key":"2023051907504217100_btad286-B4","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1109\/MSP.2017.2765202","article-title":"Generative adversarial networks: an overview","volume":"35","author":"Creswell","year":"2018","journal-title":"IEEE Signal Process Mag"},{"key":"2023051907504217100_btad286-B5","doi-asserted-by":"crossref","first-page":"5069","DOI":"10.1128\/AEM.03006-05","article-title":"Greengenes, a chimera-checked 16s rRNA gene database and workbench compatible with arb","volume":"72","author":"DeSantis","year":"2006","journal-title":"Appl Environ Microbiol"},{"key":"2023051907504217100_btad286-B6","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1016\/j.mib.2015.04.004","article-title":"Metagenomics meets time series analysis: unraveling microbial community dynamics","volume":"25","author":"Faust","year":"2015","journal-title":"Curr Opin Microbiol"},{"key":"2023051907504217100_btad286-B7","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1016\/j.chom.2015.04.007","article-title":"Microbiota in allergy and asthma and the emerging relationship with the gut microbiome","volume":"17","author":"Fujimura","year":"2015","journal-title":"Cell Host Microbe"},{"key":"2023051907504217100_btad286-B8","first-page":"647","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Gao","year":"2021"},{"key":"2023051907504217100_btad286-B9","doi-asserted-by":"crossref","first-page":"2224","DOI":"10.3389\/fmicb.2017.02224","article-title":"Microbiome datasets are compositional: and this is not optional","volume":"8","author":"Gloor","year":"2017","journal-title":"Front Microbiol"},{"key":"2023051907504217100_btad286-B10","volume-title":"Handbook of psychology: Research methods in psychology","author":"Graham","year":"2013"},{"key":"2023051907504217100_btad286-B11","author":"Gupta","year":"2020"},{"key":"2023051907504217100_btad286-B12","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1038\/s41591-019-0714-x","article-title":"Fecal dysbiosis in infants with cystic fibrosis is associated with early linear growth failure","volume":"26","author":"Hayden","year":"2020","journal-title":"Nat Med"},{"key":"2023051907504217100_btad286-B13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s11749-009-0138-x","article-title":"Missing data methods in longitudinal studies: a review","volume":"18","author":"Ibrahim","year":"2009","journal-title":"Test (Madr)"},{"key":"2023051907504217100_btad286-B14","first-page":"168","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Jung","year":"2019"},{"key":"2023051907504217100_btad286-B15","author":"Kingma","year":"2014"},{"key":"2023051907504217100_btad286-B16","doi-asserted-by":"crossref","first-page":"1489","DOI":"10.1053\/j.gastro.2014.02.009","article-title":"The microbiome in inflammatory bowel disease: current status and the future ahead","volume":"146","author":"Kostic","year":"2014","journal-title":"Gastroenterology"},{"key":"2023051907504217100_btad286-B17","volume-title":"Methods and Applications of Longitudinal Data Analysis","author":"Liu","year":"2015"},{"key":"2023051907504217100_btad286-B18","first-page":"1","article-title":"Multivariate time series imputation with generative adversarial networks","volume":"31","author":"Luo","year":"2018","journal-title":"Advances Neural Inf Process Syst"},{"key":"2023051907504217100_btad286-B19","first-page":"3094","volume-title":"Proceedings of the 28th International Joint Conference on Artificial Intelligence","author":"Luo","year":"2019"},{"key":"2023051907504217100_btad286-B20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-020-18871-1","article-title":"Health and disease markers correlate with gut microbiome composition across thousands of people","volume":"11","author":"Manor","year":"2020","journal-title":"Nat Commun"},{"key":"2023051907504217100_btad286-B21","doi-asserted-by":"crossref","first-page":"e20447","DOI":"10.1371\/journal.pone.0020447","article-title":"Towards the human colorectal cancer microbiome","volume":"6","author":"Marchesi","year":"2011","journal-title":"PLoS ONE"},{"key":"2023051907504217100_btad286-B22","doi-asserted-by":"crossref","first-page":"113696","DOI":"10.1016\/j.eswa.2020.113696","article-title":"Improving classification accuracy using data augmentation on small data sets","volume":"161","author":"Moreno-Barea","year":"2020","journal-title":"Expert Syst Appl"},{"key":"2023051907504217100_btad286-B23","author":"Oh","year":"2021"},{"key":"2023051907504217100_btad286-B24","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2023051907504217100_btad286-B25","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1038\/nature11450","article-title":"A metagenome-wide association study of gut microbiota in type 2 diabetes","volume":"490","author":"Qin","year":"2012","journal-title":"Nature"},{"key":"2023051907504217100_btad286-B26","doi-asserted-by":"crossref","first-page":"2526","DOI":"10.1038\/ismej.2017.107","article-title":"Modeling time-series data from microbial communities","volume":"11","author":"Ridenhour","year":"2017","journal-title":"ISME J"},{"key":"2023051907504217100_btad286-B27","doi-asserted-by":"crossref","first-page":"giab005","DOI":"10.1093\/gigascience\/giab005","article-title":"MB-GAN: microbiome simulation via generative adversarial network","volume":"10","author":"Rong","year":"2021","journal-title":"GigaScience"},{"key":"2023051907504217100_btad286-B28","doi-asserted-by":"crossref","first-page":"3707","DOI":"10.1093\/bioinformatics\/btab482","article-title":"phylostm: a novel deep learning model on disease prediction from longitudinal microbiome data","volume":"37","author":"Sharma","year":"2021","journal-title":"Bioinformatics"},{"key":"2023051907504217100_btad286-B29","doi-asserted-by":"crossref","first-page":"4544","DOI":"10.1093\/bioinformatics\/btaa542","article-title":"Taxonn: ensemble of neural networks on stratified microbiome data for disease prediction","volume":"36","author":"Sharma","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051907504217100_btad286-B30","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40168-017-0295-1","article-title":"Longitudinal development of the gut microbiome and metabolome in preterm neonates with late onset sepsis and healthy controls","volume":"5","author":"Stewart","year":"2017","journal-title":"Microbiome"},{"key":"2023051907504217100_btad286-B31","doi-asserted-by":"crossref","first-page":"902","DOI":"10.1038\/nmeth.3589","article-title":"Metaphlan2 for enhanced metagenomic taxonomic profiling","volume":"12","author":"Truong","year":"2015","journal-title":"Nat Methods"},{"key":"2023051907504217100_btad286-B32","first-page":"1","article-title":"Mice: multivariate imputation by chained equations in r","volume":"45","author":"van Buuren","year":"2011","journal-title":"J Stat Soft"},{"key":"2023051907504217100_btad286-B33","doi-asserted-by":"crossref","first-page":"842","DOI":"10.1016\/j.cell.2016.04.007","article-title":"Variation in microbiome LPS immunogenicity contributes to autoimmunity in humans","volume":"165","author":"Vatanen","year":"2016","journal-title":"Cell"},{"key":"2023051907504217100_btad286-B34","doi-asserted-by":"crossref","first-page":"bbaa073","DOI":"10.1093\/bib\/bbaa073","article-title":"A novel deep learning method for predictive modeling of microbiome data","volume":"22","author":"Wang","year":"2021","journal-title":"Brief Bioinf"},{"key":"2023051907504217100_btad286-B35","doi-asserted-by":"crossref","first-page":"103576","DOI":"10.1016\/j.jbi.2020.103576","article-title":"A deep learning-based, unsupervised method to impute missing values in electronic health records for improved patient management","volume":"111","author":"Xu","year":"2020","journal-title":"J Biomed Inform"},{"key":"2023051907504217100_btad286-B36","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1016\/j.ins.2020.11.035","article-title":"Missing value imputation in multivariate time series with end-to-end generative adversarial networks","volume":"551","author":"Zhang","year":"2021","journal-title":"Inf Sci"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad286\/50101071\/btad286.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/5\/btad286\/50394857\/btad286.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/5\/btad286\/50394857\/btad286.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,19]],"date-time":"2023-05-19T07:51:21Z","timestamp":1684482681000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad286\/7143379"}},"subtitle":[],"editor":[{"given":"Valentina","family":"Boeva","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,4,26]]},"references-count":36,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,5,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad286","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,5,1]]},"published":{"date-parts":[[2023,4,26]]},"article-number":"btad286"}}