{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T19:54:43Z","timestamp":1769630083775,"version":"3.49.0"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,1,19]],"date-time":"2021-01-19T00:00:00Z","timestamp":1611014400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,1,19]],"date-time":"2021-01-19T00:00:00Z","timestamp":1611014400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["31671366, 32070667"],"award-info":[{"award-number":["31671366, 32070667"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BioData Mining"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>The diagnosis of inflammatory bowel disease (IBD) and discrimination between the types of IBD are clinically important. IBD is associated with marked changes in the intestinal microbiota. 
Advances in next-generation sequencing (NGS) technology and the improved bioinformatics analysis capabilities of hospitals motivated us to develop a diagnostic method based on the gut microbiome.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>Using a set of whole-genome sequencing (WGS) data from 349 human gut microbiota samples, covering two types of IBD and healthy controls, we assembled and aligned WGS short reads to obtain feature profiles of strains and genera. The genus and strain profiles were used to construct the 16S-based and WGS-based diagnostic modules, respectively. We designed a novel feature selection procedure to select case-specific features. With these features, we built discrimination models using different machine learning algorithms. The machine learning algorithm LightGBM outperformed other algorithms in this study and thus was chosen as the core algorithm. Specifically, we identified two small sets of biomarkers (strains) separately for the WGS-based health vs IBD module and ulcerative colitis vs Crohn\u2019s disease module, which contributed to the optimization of model performance during pre-training.<\/jats:p>\n                <jats:p>We released LightCUD as an IBD diagnostic program built with LightGBM. Its high performance was validated through five-fold cross-validation and on an independent test data set. LightCUD was implemented in Python and packaged for free installation with customized databases. With WGS data or 16S rRNA sequencing data of gut microbiome samples as the input, LightCUD can discriminate IBD from healthy controls with high accuracy and further identify the specific type of IBD. 
The executable program LightCUD was released as open source, with instructions, at the webpage <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"http:\/\/cqb.pku.edu.cn\/ZhuLab\/LightCUD\/\">http:\/\/cqb.pku.edu.cn\/ZhuLab\/LightCUD\/<\/jats:ext-link>. The identified strain biomarkers could be used to study the critical factors for disease development and recommend treatments regarding changes in the gut microbial community.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>As the first released human gut microbiome-based IBD diagnostic tool, LightCUD demonstrates high performance on both WGS and 16S sequencing data. The strains that either distinguish healthy controls from IBD patients or distinguish the specific type of IBD are expected to be clinically important as biomarkers.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s13040-021-00241-2","type":"journal-article","created":{"date-parts":[[2021,1,19]],"date-time":"2021-01-19T21:02:52Z","timestamp":1611090172000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["LightCUD: a program for diagnosing IBD based on human gut microbiome 
data"],"prefix":"10.1186","volume":"14","author":[{"given":"Congmin","family":"Xu","sequence":"first","affiliation":[]},{"given":"Man","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Zhongjie","family":"Xie","sequence":"additional","affiliation":[]},{"given":"Mo","family":"Li","sequence":"additional","affiliation":[]},{"given":"Xi","family":"Zhu","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6376-218X","authenticated-orcid":false,"given":"Huaiqiu","family":"Zhu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,1,19]]},"reference":[{"issue":"7422","key":"241_CR1","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1038\/nature11582","volume":"491","author":"L Jostins","year":"2012","unstructured":"Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature. 2012;491(7422):119.","journal-title":"Nature."},{"issue":"42","key":"241_CR2","doi-asserted-by":"publisher","first-page":"1166","DOI":"10.15585\/mmwr.mm6542a3","volume":"65","author":"JM Dahlhamer","year":"2016","unstructured":"Dahlhamer JM, Zammitti EP, Ward BW, Wheaton AG, Croft JB. Prevalence of inflammatory bowel disease among adults aged \u226518 years - United States, 2015. MMWR Morb Mortal Wkly Rep. 2016;65(42):1166\u20139.","journal-title":"MMWR Morb Mortal Wkly Rep"},{"issue":"12","key":"241_CR3","doi-asserted-by":"publisher","first-page":"1772","DOI":"10.1111\/j.1440-1746.2006.04674.x","volume":"21","author":"Q Ouyang","year":"2006","unstructured":"Ouyang Q, Tandon R, Goh KL, Pan G-Z, Fock KM, Fiocchi C, et al. Management consensus of inflammatory bowel disease for the Asia-Pacific region. J Gastroenterol Hepatol. 
2006;21(12):1772\u201382.","journal-title":"J Gastroenterol Hepatol"},{"issue":"8","key":"241_CR4","first-page":"123","volume":"106","author":"DC Baumgart","year":"2009","unstructured":"Baumgart DC. The diagnosis and treatment of crohn's disease and ulcerative colitis. Deutsches Aerzteblatt Int. 2009;106(8):123.","journal-title":"Deutsches Aerzteblatt Int"},{"issue":"3","key":"241_CR5","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1053\/j.sempedsurg.2007.04.002","volume":"16","author":"S Kugathasan","year":"2007","unstructured":"Kugathasan S, Fiocchi C. Progress in basic inflammatory bowel disease research. Semin Pediatr Surg. 2007;16(3):146\u201353.","journal-title":"Semin Pediatr Surg"},{"issue":"6","key":"241_CR6","doi-asserted-by":"publisher","first-page":"1817","DOI":"10.1053\/j.gastro.2010.11.058","volume":"140","author":"JD Lewis","year":"2011","unstructured":"Lewis JD. The utility of biomarkers in the diagnosis and therapy of inflammatory bowel disease. Gastroenterology. 2011;140(6):1817\u201326.","journal-title":"Gastroenterology."},{"issue":"4","key":"241_CR7","doi-asserted-by":"publisher","first-page":"506","DOI":"10.1136\/gut.47.4.506","volume":"47","author":"J Tibble","year":"2000","unstructured":"Tibble J. A simple method for assessing intestinal inflammation in Crohn's disease. Gut. 2000;47(4):506\u201313.","journal-title":"Gut."},{"issue":"7","key":"241_CR8","doi-asserted-by":"publisher","first-page":"688","DOI":"10.1038\/sj.embor.7400731","volume":"7","author":"AM O'Hara","year":"2006","unstructured":"O'Hara AM, Shanahan F. The gut flora as a forgotten organ. EMBO Rep. 2006;7(7):688\u201393.","journal-title":"EMBO Rep"},{"issue":"3","key":"241_CR9","doi-asserted-by":"publisher","first-page":"382","DOI":"10.1016\/j.chom.2014.02.005","volume":"15","author":"D Gevers","year":"2014","unstructured":"Gevers D, Kugathasan S, Denson Lee A, V\u00e1zquez-Baeza Y, Van Treuren W, Ren B, et al. 
The treatment-naive microbiome in new-onset Crohn\u2019s disease. Cell Host Microbe. 2014;15(3):382\u201392.","journal-title":"Cell Host Microbe"},{"issue":"3","key":"241_CR10","doi-asserted-by":"publisher","first-page":"481","DOI":"10.1097\/MIB.0b013e31827fec6d","volume":"19","author":"M Rajili\u0107-Stojanovi\u0107","year":"2013","unstructured":"Rajili\u0107-Stojanovi\u0107 M, Shanahan F, Guarner F, de Vos WM. Phylogenetic analysis of dysbiosis in ulcerative colitis during remission. Inflamm Bowel Dis. 2013;19(3):481\u20138.","journal-title":"Inflamm Bowel Dis"},{"issue":"6","key":"241_CR11","doi-asserted-by":"publisher","first-page":"1844","DOI":"10.1053\/j.gastro.2010.08.049","volume":"139","author":"BP Willing","year":"2010","unstructured":"Willing BP, Dicksved J, Halfvarson J, Andersson AF, Lucio M, Zheng Z, et al. A pyrosequencing study in twins shows that gastrointestinal microbial profiles vary with inflammatory bowel disease phenotypes. Gastroenterology. 2010;139(6):1844\u201354.","journal-title":"Gastroenterology."},{"issue":"3","key":"241_CR12","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1111\/j.1467-985X.2010.00646_6.x","volume":"173","author":"LZ John","year":"2010","unstructured":"John LZ. The elements of statistical learning: data mining, inference, and prediction. J Roy Stat Soc A Sta. 2010;173(3):693\u20134.","journal-title":"J Roy Stat Soc A Sta"},{"issue":"4","key":"241_CR13","doi-asserted-by":"publisher","first-page":"049901","DOI":"10.1117\/1.2819119","volume":"16","author":"NM Nasrabadi","year":"2007","unstructured":"Nasrabadi NM. Pattern recognition and machine learning. J Electron Imaging. 2007;16(4):049901.","journal-title":"J Electron Imaging"},{"key":"241_CR14","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781107298019","volume-title":"Understanding machine learning: from theory to algorithms: Cambridge university press","author":"S Shalev-Shwartz","year":"2014","unstructured":"Shalev-Shwartz S, Ben-David S. 
Understanding machine learning: from theory to algorithms: Cambridge university press; 2014."},{"key":"241_CR15","first-page":"3146","volume-title":"LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems","author":"G Ke","year":"2017","unstructured":"Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems; 2017. p. 3146\u201354."},{"issue":"D1","key":"241_CR16","doi-asserted-by":"publisher","first-page":"D36","DOI":"10.1093\/nar\/gks1195","volume":"41","author":"DA Benson","year":"2012","unstructured":"Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, et al. GenBank. Nucleic Acids Res. 2012;41(D1):D36\u201342.","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"241_CR17","first-page":"D141","volume":"37","author":"JR Cole","year":"2008","unstructured":"Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, et al. The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 2008;37(suppl_1):D141\u2013D5.","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"241_CR18","doi-asserted-by":"publisher","first-page":"489","DOI":"10.1016\/j.chom.2015.09.008","volume":"18","author":"JD Lewis","year":"2015","unstructured":"Lewis JD, Chen EZ, Baldassano RN, Otley AR, Griffiths AM, Lee D, et al. Inflammation, antibiotics, and diet as environmental stressors of the gut microbiome in pediatric Crohn's disease. Cell Host Microbe. 2015;18(4):489\u2013500. https:\/\/doi.org\/10.1016\/j.chom.2015.09.008.","journal-title":"Cell Host Microbe"},{"issue":"37","key":"241_CR19","doi-asserted-by":"publisher","first-page":"5941","DOI":"10.3748\/wjg.v12.i37.5941","volume":"12","author":"JB Ewaschuk","year":"2006","unstructured":"Ewaschuk JB. Probiotics and prebiotics in chronic inflammatory bowel diseases. World J Gastroenterol. 
2006;12(37):5941\u201325.","journal-title":"World J Gastroenterol"},{"issue":"5","key":"241_CR20","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1016\/S1473-3099(06)70464-9","volume":"6","author":"V Sizaire","year":"2006","unstructured":"Sizaire V, Nackers F, Comte E, Portaels F. Mycobacterium ulcerans infection: control, diagnosis, and treatment. Lancet Infect Dis. 2006;6(5):288\u201396.","journal-title":"Lancet Infect Dis"},{"issue":"3","key":"241_CR21","doi-asserted-by":"publisher","first-page":"306","DOI":"10.1080\/13102818.2007.10817465","volume":"21","author":"M Stoyanova","year":"2007","unstructured":"Stoyanova M, Pavlina I, Moncheva P, Bogatzevska N. Biodiversity and incidence of Burkholderia species. Biotechnol Biotec Eq. 2007;21(3):306\u201310.","journal-title":"Biotechnol Biotec Eq"},{"issue":"1\u20132","key":"241_CR22","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1016\/0378-1135(91)90046-I","volume":"26","author":"P Brouqui","year":"1991","unstructured":"Brouqui P, Davoust B, Haddad S, Vidor E, Raoult D. Serological evaluation of Ehrlichia canis infections in military dogs in Africa and Reunion Island. Vet Microbiol. 1991;26(1\u20132):103\u20135.","journal-title":"Vet Microbiol"},{"key":"241_CR23","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1016\/j.micres.2015.03.010","volume":"174","author":"T Bennur","year":"2015","unstructured":"Bennur T, Kumar AR, Zinjarde S, Javdekar V. Nocardiopsis species: incidence, ecological roles and adaptations. Microbiol Res. 2015;174:33\u201347.","journal-title":"Microbiol Res"},{"issue":"2","key":"241_CR24","doi-asserted-by":"publisher","first-page":"e56685","DOI":"10.1371\/journal.pone.0056685","volume":"8","author":"D Nagy-Szakal","year":"2013","unstructured":"Nagy-Szakal D, Hollister EB, Luna RA, et al. Cellulose supplementation early in life ameliorates colitis in adult mice. PLoS One. 
2013;8(2):e56685.","journal-title":"PLoS One"},{"issue":"8","key":"241_CR25","doi-asserted-by":"publisher","first-page":"822","DOI":"10.1038\/nbt.2939","volume":"32","author":"HB Nielsen","year":"2014","unstructured":"Nielsen HB, Almeida M, Juncker AS, Rasmussen S, Li J, Sunagawa S, et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat Biotechnol. 2014;32(8):822\u20138. https:\/\/doi.org\/10.1038\/nbt.2939.","journal-title":"Nat Biotechnol"},{"issue":"1","key":"241_CR26","doi-asserted-by":"publisher","first-page":"244","DOI":"10.1186\/s12859-015-0686-x","volume":"16","author":"B Lai","year":"2015","unstructured":"Lai B, Wang F, Wang X, Duan L, Zhu H. InteMAP: integrated metagenomic assembly pipeline for NGS short reads. BMC Bioinformatics. 2015;16(1):244.","journal-title":"BMC Bioinformatics"},{"issue":"10","key":"241_CR27","doi-asserted-by":"publisher","first-page":"e76185","DOI":"10.1371\/journal.pone.0076185","volume":"8","author":"F Guo","year":"2013","unstructured":"Guo F, Ju F, Cai L, et al. Taxonomic precision of different hypervariable regions of 16S rRNA gene and annotation methods for functional bacterial groups in biological wastewater treatment. PLoS One. 2013;8(10):e76185.","journal-title":"PLoS One"},{"issue":"9","key":"241_CR28","doi-asserted-by":"publisher","first-page":"673","DOI":"10.1038\/nmeth.1358","volume":"6","author":"A Brady","year":"2009","unstructured":"Brady A, Salzberg SL. Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated markov models. Nat Methods. 2009;6(9):673.","journal-title":"Nat Methods"},{"issue":"4","key":"241_CR29","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/nmeth.1923","volume":"9","author":"B Langmead","year":"2012","unstructured":"Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 
2012;9(4):357\u20139.","journal-title":"Nat Methods"},{"key":"241_CR30","doi-asserted-by":"publisher","first-page":"80","DOI":"10.2307\/3001968","volume":"1","author":"F Wilcoxon","year":"1945","unstructured":"Wilcoxon F. Individual comparisons by ranking methods. Biometrics. 1945;1:80\u20133.","journal-title":"Biometrics."},{"issue":"9","key":"241_CR31","doi-asserted-by":"publisher","first-page":"e1002687","DOI":"10.1371\/journal.pcbi.1002687","volume":"8","author":"J Friedman","year":"2012","unstructured":"Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687.","journal-title":"PLoS Comput Biol"}],"container-title":["BioData Mining"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13040-021-00241-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s13040-021-00241-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13040-021-00241-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,1,19]],"date-time":"2021-01-19T21:05:56Z","timestamp":1611090356000},"score":1,"resource":{"primary":{"URL":"https:\/\/biodatamining.biomedcentral.com\/articles\/10.1186\/s13040-021-00241-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,19]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["241"],"URL":"https:\/\/doi.org\/10.1186\/s13040-021-00241-2","relation":{},"ISSN":["1756-0381"],"issn-type":[{"value":"1756-0381","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1,19]]},"assertion":[{"value":"6 July 
2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 January 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 January 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"2"}}