{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:46:21Z","timestamp":1760233581929,"version":"build-2065373602"},"reference-count":46,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2021,2,2]],"date-time":"2021-02-02T00:00:00Z","timestamp":1612224000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004410","name":"T\u00fcrkiye Bilimsel ve Teknolojik Ara\u015ftirma Kurumu","doi-asserted-by":"publisher","award":["120E092"],"award-info":[{"award-number":["120E092"]}],"id":[{"id":"10.13039\/501100004410","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Quantitative metagenomics is an important field that has delivered successful microbiome biomarkers associated with host phenotypes. The current convention mainly depends on unsupervised assembly of metagenomic contigs with a possibility of leaving interesting genetic material unassembled. Additionally, biomarkers are commonly defined on the differential relative abundance of compositional or functional units. Accumulating evidence supports that microbial genetic variations are as important as the differential abundance content, implying the need for novel methods accounting for the genetic variations in metagenomics studies. We propose an information theoretic metagenome assembly algorithm, discovering genomic fragments with maximal self-information, defined by the empirical distributions of nucleotides across the phenotypes and quantified with the help of statistical tests. Our algorithm infers fragments populating the most informative genetic variants in a single contig, named supervariant fragments. 
Experiments on simulated metagenomes, as well as on a colorectal cancer and an atherosclerotic cardiovascular disease dataset consistently discovered sequences strongly associated with the disease phenotypes. Moreover, the discriminatory power of these putative biomarkers was mainly attributed to the genetic variations rather than relative abundance. Our results support that a focus on metagenomics methods considering microbiome population genetics might be useful in discovering disease biomarkers with a great potential of translating to molecular diagnostics and biotherapeutics applications.<\/jats:p>","DOI":"10.3390\/e23020187","type":"journal-article","created":{"date-parts":[[2021,2,2]],"date-time":"2021-02-02T13:01:12Z","timestamp":1612270872000},"page":"187","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2278-7786","authenticated-orcid":false,"given":"O. Ufuk","family":"Nalbantoglu","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, Erciyes University, 38039 Kayseri, Turkey"},{"name":"Genome and Stem Cell Center, Erciyes University, 38039 Kayseri, Turkey"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,2]]},"reference":[{"key":"ref_1","first-page":"184","article-title":"Metagenomic analysis and its applications","volume":"3","author":"Ghosh","year":"2019","journal-title":"Encycl. Bioinform. 
Comput Biol"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"655","DOI":"10.1038\/s41586-019-1237-9","article-title":"Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases","volume":"569","author":"Lloyd","year":"2019","journal-title":"Nature"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-017-00900-1","article-title":"The gut microbiome in atherosclerotic cardiovascular disease","volume":"8","author":"Jie","year":"2017","journal-title":"Nat. Commun."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1038\/nature11450","article-title":"A metagenome-wide association study of gut microbiota in type 2 diabetes","volume":"490","author":"Qin","year":"2012","journal-title":"Nature"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1186\/s40168-018-0451-2","article-title":"Multi-cohort analysis of colorectal cancer metagenome identified altered bacteria across populations and universal bacterial markers","volume":"6","author":"Dai","year":"2018","journal-title":"Microbiome"},{"key":"ref_6","first-page":"4095789","article-title":"The gut microbiome profile in obesity: A systematic review","volume":"2018","author":"Castaner","year":"2018","journal-title":"Int. J. Endocrinol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1136\/annrheumdis-2019-215743","article-title":"Metagenome-wide association study of gut microbiome revealed novel aetiology of rheumatoid arthritis in the Japanese population","volume":"79","author":"Kishikawa","year":"2020","journal-title":"Ann. Rheum. 
Dis."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nature13568","article-title":"Alterations of the human gut microbiome in liver cirrhosis","volume":"513","author":"Qin","year":"2014","journal-title":"Nature"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"518","DOI":"10.3389\/fped.2019.00518","article-title":"Metagenome of gut microbiota of children with nonalcoholic fatty liver disease Short title: Microbiome analysis of NAFLD children","volume":"7","author":"Zhao","year":"2019","journal-title":"Front. Pediatrics"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12866-018-1257-x","article-title":"A metagenome-wide association study of gut microbiota in asthma in UK adults","volume":"18","author":"Wang","year":"2018","journal-title":"BMC Microbiol."},{"key":"ref_11","first-page":"1","article-title":"Functional implications of microbial and viral gut metagenome changes in early stage L-DOPA-na\u00efve Parkinson\u2019s disease patients","volume":"9","author":"Bedarf","year":"2017","journal-title":"Genome Med."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"633","DOI":"10.1016\/j.bbi.2019.05.008","article-title":"Altered microbiomes distinguish Alzheimer\u2019s disease from amnestic mild cognitive impairment and health in a Chinese cohort","volume":"80","author":"Liu","year":"2019","journal-title":"Brain Behav. Immun."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"558","DOI":"10.1099\/jmm.0.001178","article-title":"The bacterial neurometabolic signature of the gut microbiota of young children with autism spectrum disorders","volume":"69","author":"Averina","year":"2020","journal-title":"J. Med. Microbiol."},{"key":"ref_14","first-page":"1","article-title":"Metagenome-wide association of gut microbiome features for schizophrenia","volume":"11","author":"Zhu","year":"2020","journal-title":"Nat. 
Commun."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1038\/s41586-020-2095-1","article-title":"Microbiome analyses of blood and tissues suggest cancer diagnostic approach","volume":"579","author":"Poore","year":"2020","journal-title":"Nature"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1186\/s40168-019-0767-6","article-title":"Advancing functional and translational microbiome research using meta-omics approaches","volume":"7","author":"Zhang","year":"2019","journal-title":"Microbiome"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1186\/s40169-019-0232-y","article-title":"The significance of microbiome in personalized medicine","volume":"8","author":"Behrouzi","year":"2019","journal-title":"Clin. Transl. Med."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1038\/nrg3367","article-title":"Sequence assembly demystified","volume":"14","author":"Nagarajan","year":"2013","journal-title":"Nat. Rev. Genet."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"e31386","DOI":"10.1371\/journal.pone.0031386","article-title":"Assessment of metagenomic assembly using simulated next generation sequencing data","volume":"7","author":"Mende","year":"2012","journal-title":"PLoS ONE"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"707","DOI":"10.1093\/gbe\/evy031","article-title":"Microbial dark matter investigations: How microbial studies transform biological knowledge and empirically sketch a logic of scientific discovery","volume":"10","author":"Bernard","year":"2018","journal-title":"Genome Biol. 
Evol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"864","DOI":"10.1099\/acmi.ac2019.po0557","article-title":"Uncovering the dark matter of the metagenome one read at a time","volume":"1","author":"Dimonaco","year":"2019","journal-title":"Access Microbiol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1038\/s41586-019-1065-y","article-title":"Structural variation in the gut microbiome associates with host health","volume":"568","author":"Zeevi","year":"2019","journal-title":"Nature"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40168-017-0232-3","article-title":"Gut metagenomes of type 2 diabetic patients have characteristic single-nucleotide polymorphism distribution in Bacteroides coprocola","volume":"5","author":"Chen","year":"2017","journal-title":"Microbiome"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1464","DOI":"10.1073\/pnas.1218080110","article-title":"Modulating the innate immune response by combinatorial engineering of endotoxin","volume":"110","author":"Needham","year":"2013","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1038\/nm.3145","article-title":"Intestinal microbiota metabolism of L-carnitine, a nutrient in red meat, promotes atherosclerosis","volume":"19","author":"Koeth","year":"2013","journal-title":"Nat. Med."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1038\/nrmicro.2016.17","article-title":"The microbial pharmacists within us: A metagenomic view of xenobiotic metabolism","volume":"14","author":"Spanogiannopoulos","year":"2016","journal-title":"Nat. Rev. 
Microbiol."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/j.chom.2019.01.004","article-title":"Distinct genetic and functional traits of human intestinal Prevotella copri strains are associated with different habitual diets","volume":"25","author":"Filippis","year":"2019","journal-title":"Cell Host Microbe"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1186\/s13059-015-0804-0","article-title":"Subtractive assembly for comparative metagenomics, and its application to type 2 diabetes metagenomes","volume":"16","author":"Wang","year":"2015","journal-title":"Genome Biol."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Han, W., Wang, M., and Ye, Y. (2017). A concurrent subtractive assembly approach for identification of disease associated sub-metagenomes. Proceedings of the International Conference on Research in Computational Molecular Biology, Springer.","DOI":"10.1007\/978-3-319-56970-3_2"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1136\/gutjnl-2015-309800","article-title":"Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer","volume":"66","author":"Yu","year":"2017","journal-title":"Gut"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"e3373","DOI":"10.1371\/journal.pone.0003373","article-title":"MetaSim: A sequencing simulator for genomics and metagenomics","volume":"3","author":"Richter","year":"2008","journal-title":"PLoS ONE"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hyatt, D., Chen, G.L., Cascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. (2010). Prodigal: Prokaryotic gene recognition and translation initiation site identification. 
BMC Bioinform., 11.","DOI":"10.1186\/1471-2105-11-119"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1674","DOI":"10.1093\/bioinformatics\/btv033","article-title":"MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph","volume":"31","author":"Li","year":"2015","journal-title":"Bioinformatics"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1420","DOI":"10.1093\/bioinformatics\/bts174","article-title":"IDBA-UD: A de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth","volume":"28","author":"Peng","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 3\u201317). Xgboost: A scalable tree boosting system. Proceedings of the 22nd acm Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1038\/nbt.2942","article-title":"An integrated catalog of reference genes in the human gut microbiome","volume":"32","author":"Li","year":"2014","journal-title":"Nat. 
Biotechnol."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1103","DOI":"10.1016\/j.cell.2016.08.007","article-title":"Toward accurate and quantitative comparative metagenomics","volume":"166","author":"Nayfach","year":"2016","journal-title":"Cell"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"649","DOI":"10.1016\/j.cell.2019.01.001","article-title":"Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle","volume":"176","author":"Pasolli","year":"2019","journal-title":"Cell"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/j.tig.2019.10.010","article-title":"Population genetics in the human microbiome","volume":"36","author":"Garud","year":"2020","journal-title":"Trends Genet."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"2759","DOI":"10.1093\/bioinformatics\/btx304","article-title":"KMC 3: Counting and manipulating k-mer statistics","volume":"33","author":"Kokot","year":"2017","journal-title":"Bioinformatics"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"9748","DOI":"10.1073\/pnas.171285098","article-title":"An Eulerian path approach to DNA fragment assembly","volume":"98","author":"Pevzner","year":"2001","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"987","DOI":"10.1038\/nbt.2023","article-title":"Why are de Bruijn graphs useful for genome assembly?","volume":"29","author":"Compeau","year":"2011","journal-title":"Nat. Biotechnol."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1093\/bib\/bbz020","article-title":"New approaches for metagenome assembly with short reads","volume":"21","author":"Ayling","year":"2020","journal-title":"Brief. 
Bioinform."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1109\/MSP.2006.1657823","article-title":"The viterbi algorithm: A personal history","volume":"23","author":"Forney","year":"2006","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"2115","DOI":"10.1093\/molbev\/msx148","article-title":"Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper","volume":"34","author":"Huerta","year":"2017","journal-title":"Mol. Biol. Evol."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/ncomms11257","article-title":"Fast and sensitive taxonomic classification for metagenomics with Kaiju","volume":"7","author":"Menzel","year":"2016","journal-title":"Nat. Commun."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/2\/187\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:19:07Z","timestamp":1760159947000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/2\/187"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,2]]},"references-count":46,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2021,2]]}},"alternative-id":["e23020187"],"URL":"https:\/\/doi.org\/10.3390\/e23020187","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2021,2,2]]}}}