{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,30]],"date-time":"2026-03-30T13:11:08Z","timestamp":1774876268420,"version":"3.50.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,7,21]],"date-time":"2023-07-21T00:00:00Z","timestamp":1689897600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,7,21]],"date-time":"2023-07-21T00:00:00Z","timestamp":1689897600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["PTDC\/BTMTEC\/ 0367\/2021"],"award-info":[{"award-number":["PTDC\/BTMTEC\/ 0367\/2021"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["2021.05767.BD"],"award-info":[{"award-number":["2021.05767.BD"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["CEECIND\/01854\/2017"],"award-info":[{"award-number":["CEECIND\/01854\/2017"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["PTDC\/BTMTEC\/ 0367\/2021"],"award-info":[{"award-number":["PTDC\/BTMTEC\/ 0367\/2021"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["PTDC\/BTMTEC\/ 0367\/2021"],"award-info":[{"award-number":["PTDC\/BTMTEC\/ 0367\/2021"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia","doi-asserted-by":"publisher","award":["PTDC\/BTMTEC\/ 0367\/2021"],"award-info":[{"award-number":["PTDC\/BTMTEC\/ 0367\/2021"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Sci Rep"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Emerging evidence of the relationship between the microbiome composition and the development of numerous diseases, including cancer, has led to an increasing interest in the study of the human microbiome. Technological breakthroughs regarding DNA sequencing methods propelled microbiome studies with a large number of samples, which called for the necessity of more sophisticated data-analytical tools to analyze this complex relationship. The aim of this work was to develop a machine learning-based approach to distinguish the type of cancer based on the analysis of the tissue-specific microbial information, assessing the human microbiome as valuable predictive information for cancer identification. For this purpose, Random Forest algorithms were trained for the classification of five types of cancer\u2014head and neck, esophageal, stomach, colon, and rectum cancers\u2014with samples provided by The Cancer Microbiome Atlas database. One versus all and multi-class classification studies were conducted to evaluate the discriminative capability of the microbial data across increasing levels of cancer site specificity, with results showing a progressive rise in difficulty for accurate sample classification. Random Forest models achieved promising performances when predicting head and neck, stomach, and colon cancer cases, with the latter returning accuracy scores above 90% across the different studies conducted. However, there was also an increased difficulty when discriminating esophageal and rectum cancers, failing to differentiate with adequate results rectum from colon cancer cases, and esophageal from head and neck and stomach cancers. These results point to the fact that anatomically adjacent cancers can be more complex to identify due to microbial similarities. Despite the limitations, microbiome data analysis using machine learning may advance novel strategies to improve cancer detection and prevention, and decrease disease burden.<\/jats:p>","DOI":"10.1038\/s41598-023-38670-0","type":"journal-article","created":{"date-parts":[[2023,7,21]],"date-time":"2023-07-21T16:04:15Z","timestamp":1689955455000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":40,"title":["Machine learning-based approaches for cancer prediction using microbiome data"],"prefix":"10.1038","volume":"13","author":[{"given":"Pedro","family":"Freitas","sequence":"first","affiliation":[]},{"given":"Francisco","family":"Silva","sequence":"additional","affiliation":[]},{"given":"Joana Vale","family":"Sousa","sequence":"additional","affiliation":[]},{"given":"Rui M.","family":"Ferreira","sequence":"additional","affiliation":[]},{"given":"C\u00e9u","family":"Figueiredo","sequence":"additional","affiliation":[]},{"given":"Tania","family":"Pereira","sequence":"additional","affiliation":[]},{"given":"H\u00e9lder P.","family":"Oliveira","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,7,21]]},"reference":[{"key":"38670_CR1","doi-asserted-by":"publisher","first-page":"209","DOI":"10.3322\/caac.21660","volume":"71","author":"H Sung","year":"2021","unstructured":"Sung, H. et al. Global cancer statistics 2020: Globocan estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209\u2013249 (2021).","journal-title":"CA Cancer J. Clin."},{"key":"38670_CR2","doi-asserted-by":"publisher","first-page":"e180","DOI":"10.1016\/S2214-109X(19)30488-7","volume":"8","author":"C de Martel","year":"2020","unstructured":"de Martel, C., Georges, D., Bray, F., Ferlay, J. & Clifford, G. M. Global burden of cancer attributable to infections in 2018: A worldwide incidence analysis. Lancet Glob. Health 8, e180\u2013e190 (2020).","journal-title":"Lancet Glob. Health"},{"key":"38670_CR3","doi-asserted-by":"publisher","first-page":"392","DOI":"10.1038\/nm.4517","volume":"24","author":"JA Gilbert","year":"2018","unstructured":"Gilbert, J. A. et al. Current understanding of the human microbiome. Nat. Med. 24, 392\u2013400 (2018).","journal-title":"Nat. Med."},{"key":"38670_CR4","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1101\/gr.126516.111","volume":"22","author":"M Castellarin","year":"2012","unstructured":"Castellarin, M. et al. Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma. Genome Res. 22, 299\u2013306 (2012).","journal-title":"Genome Res."},{"key":"38670_CR5","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1136\/gutjnl-2017-314205","volume":"67","author":"RM Ferreira","year":"2018","unstructured":"Ferreira, R. M. et al. Gastric microbial community profiling reveals a dysbiotic cancer-associated microbiota. Gut 67, 226\u2013236 (2018).","journal-title":"Gut"},{"key":"38670_CR6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/srep30751","volume":"6","author":"TJ Hieken","year":"2016","unstructured":"Hieken, T. J. et al. The microbiome of aseptically collected human breast tissue in benign and malignant disease. Sci. Rep. 6, 1\u201310 (2016).","journal-title":"Sci. Rep."},{"key":"38670_CR7","doi-asserted-by":"publisher","first-page":"1030","DOI":"10.1080\/19490976.2020.1737487","volume":"11","author":"Y Zheng","year":"2020","unstructured":"Zheng, Y. et al. Specific gut microbiome signature predicts the early-stage lung cancer. Gut Microbes 11, 1030\u20131042 (2020).","journal-title":"Gut Microbes"},{"key":"38670_CR8","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1007\/5584_2019_366","volume":"11","author":"J Pereira-Marques","year":"2019","unstructured":"Pereira-Marques, J., Ferreira, R. M., Pinto-Ribeiro, I. & Figueiredo, C. Helicobacter pylori infection, the gastric microbiome and gastric cancer. Helicobacter pylori Hum. Dis. 11, 195\u2013210 (2019).","journal-title":"Helicobacter pylori Hum. Dis."},{"key":"38670_CR9","doi-asserted-by":"publisher","first-page":"eaan5931","DOI":"10.1126\/science.aan5931","volume":"360","author":"C Ma","year":"2018","unstructured":"Ma, C. et al. Gut microbiome-mediated bile acid metabolism regulates liver cancer via NKT cells. Science 360, eaan5931 (2018).","journal-title":"Science"},{"key":"38670_CR10","doi-asserted-by":"publisher","first-page":"631","DOI":"10.1016\/j.csbj.2020.03.003","volume":"18","author":"RM Rodriguez","year":"2020","unstructured":"Rodriguez, R. M., Hernandez, B. Y., Menor, M., Deng, Y. & Khadka, V. S. The landscape of bacterial presence in tumor and adjacent normal tissue across 9 major cancer types using TCGA exome sequencing. Comput. Struct. Biotechnol. J. 18, 631\u2013641 (2020).","journal-title":"Comput. Struct. Biotechnol. J."},{"key":"38670_CR11","doi-asserted-by":"publisher","first-page":"973","DOI":"10.1126\/science.aay9189","volume":"368","author":"D Nejman","year":"2020","unstructured":"Nejman, D. et al. The human tumor microbiome is composed of tumor type-specific intracellular bacteria. Science 368, 973\u2013980 (2020).","journal-title":"Science"},{"key":"38670_CR12","doi-asserted-by":"publisher","first-page":"3789","DOI":"10.1016\/j.cell.2022.09.005","volume":"185","author":"L Narunsky-Haziza","year":"2022","unstructured":"Narunsky-Haziza, L. et al. Pan-cancer analyses reveal cancer-type-specific fungal ecologies and bacteriome interactions. Cell 185, 3789\u20133806 (2022).","journal-title":"Cell"},{"key":"38670_CR13","doi-asserted-by":"publisher","first-page":"3807","DOI":"10.1016\/j.cell.2022.09.015","volume":"185","author":"AB Dohlman","year":"2022","unstructured":"Dohlman, A. B. et al. A pan-cancer mycobiome analysis reveals fungal involvement in gastrointestinal and lung tumors. Cell 185, 3807\u20133822 (2022).","journal-title":"Cell"},{"key":"38670_CR14","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1016\/j.tim.2018.11.003","volume":"27","author":"R Eisenhofer","year":"2019","unstructured":"Eisenhofer, R. et al. Contamination in low microbial biomass microbiome studies: Issues and recommendations. Trends Microbiol. 27, 105\u2013117 (2019).","journal-title":"Trends Microbiol."},{"key":"38670_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12915-014-0087-z","volume":"12","author":"SJ Salter","year":"2014","unstructured":"Salter, S. J. et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol. 12, 1\u201312 (2014).","journal-title":"BMC Biol."},{"key":"38670_CR16","doi-asserted-by":"publisher","DOI":"10.1155\/2018\/2936257","author":"H Wu","year":"2018","unstructured":"Wu, H. et al. Metagenomics biomarkers selected for prediction of three different diseases in Chinese population. BioMed Res. Int.https:\/\/doi.org\/10.1155\/2018\/2936257 (2018).","journal-title":"BioMed Res. Int."},{"key":"38670_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13073-016-0290-3","volume":"8","author":"NT Baxter","year":"2016","unstructured":"Baxter, N. T., Ruffin, M. T., Rogers, M. A. & Schloss, P. D. Microbiota-based model improves the sensitivity of fecal immunochemical test for detecting colonic lesions. Genome Med. 8, 1\u201310 (2016).","journal-title":"Genome Med."},{"key":"38670_CR18","doi-asserted-by":"publisher","first-page":"567","DOI":"10.1038\/s41586-020-2095-1","volume":"579","author":"GD Poore","year":"2020","unstructured":"Poore, G. D. et al. Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature 579, 567\u2013574 (2020).","journal-title":"Nature"},{"key":"38670_CR19","doi-asserted-by":"publisher","first-page":"e000297","DOI":"10.1136\/bmjgast-2019-000297","volume":"6","author":"E Dadkhah","year":"2019","unstructured":"Dadkhah, E. et al. Gut microbiome identifies risk for colorectal polyps. BMJ Open Gastroenterol. 6, e000297 (2019).","journal-title":"BMJ Open Gastroenterol."},{"key":"38670_CR20","doi-asserted-by":"publisher","first-page":"281","DOI":"10.1016\/j.chom.2020.12.001","volume":"29","author":"AB Dohlman","year":"2021","unstructured":"Dohlman, A. B. et al. The cancer microbiome atlas: A pan-cancer comparative analysis to distinguish tissue-resident microbiota from contaminants. Cell Host Microbe 29, 281\u2013298 (2021).","journal-title":"Cell Host Microbe"},{"key":"38670_CR21","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825\u20132830 (2011).","journal-title":"J. Mach. Learn. Res."},{"key":"38670_CR22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-020-20314-w","volume":"12","author":"Y Wu","year":"2021","unstructured":"Wu, Y. et al. Identification of microbial markers across populations in early detection of colorectal cancer. Nat. Commun. 12, 1\u201313 (2021).","journal-title":"Nat. Commun."},{"key":"38670_CR23","doi-asserted-by":"publisher","first-page":"1406","DOI":"10.3390\/cancers12061406","volume":"12","author":"L S\u00e1nchez-Alcoholado","year":"2020","unstructured":"S\u00e1nchez-Alcoholado, L. et al. The role of the gut microbiome in colorectal cancer development and therapy response. Cancers 12, 1406 (2020).","journal-title":"Cancers"},{"key":"38670_CR24","volume-title":"Advances in Neural Information Processing Systems","author":"SM Lundberg","year":"2017","unstructured":"Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (eds Guyon, I. et al.) (Curran Associates Inc., 2017)."},{"key":"38670_CR25","doi-asserted-by":"publisher","first-page":"3385","DOI":"10.3390\/cancers14143385","volume":"14","author":"CP da Costa","year":"2022","unstructured":"da Costa, C. P. et al. The tissue-associated microbiota in colorectal cancer: A systematic review. Cancers 14, 3385 (2022).","journal-title":"Cancers"},{"key":"38670_CR26","doi-asserted-by":"publisher","first-page":"11553","DOI":"10.2147\/CMAR.S275316","volume":"12","author":"Y Wang","year":"2020","unstructured":"Wang, Y. et al. Analyses of potential driver and passenger bacteria in human colorectal cancer. Cancer Manag. Res. 12, 11553 (2020).","journal-title":"Cancer Manag. Res."},{"key":"38670_CR27","doi-asserted-by":"publisher","first-page":"2745","DOI":"10.1158\/0008-5472.CAN-20-3827","volume":"81","author":"X Wang","year":"2021","unstructured":"Wang, X. et al. Porphyromonas gingivalis promotes colorectal carcinoma by activating the hematopoietic NLRP3 inflammasomeporphyromonas gingivalis promotes colorectal carcinoma. Can. Res. 81, 2745\u20132759 (2021).","journal-title":"Can. Res."},{"key":"38670_CR28","doi-asserted-by":"publisher","first-page":"584798","DOI":"10.3389\/fcimb.2020.584798","volume":"10","author":"W Mu","year":"2020","unstructured":"Mu, W. et al. Intracellular porphyromonas gingivalis promotes the proliferation of colorectal cancer cells via the MAPK\/ERK signaling pathway. Front. Cell. Infect. Microbiol. 10, 584798 (2020).","journal-title":"Front. Cell. Infect. Microbiol."},{"key":"38670_CR29","doi-asserted-by":"publisher","first-page":"790997","DOI":"10.3389\/fonc.2022.790997","volume":"12","author":"Y Huang","year":"2022","unstructured":"Huang, Y. et al. Is laryngeal squamous cell carcinoma related to Helicobacter pylori?. Front. Oncol. 12, 790997 (2022).","journal-title":"Front. Oncol."},{"key":"38670_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-020-65694-7","volume":"10","author":"S Pandey","year":"2020","unstructured":"Pandey, S. et al. Helicobacter pylori was not detected in oral squamous cell carcinomas from cohorts of Norwegian and Nepalese patients. Sci. Rep. 10, 1\u20138 (2020).","journal-title":"Sci. Rep."},{"key":"38670_CR31","doi-asserted-by":"crossref","unstructured":"Figueiredo, C. et\u00a0al. Pathogenesis of gastric cancer: genetics and molecular classification. Molecular Pathogenesis and Signal Transduction by Helicobacter pylori 277\u2013304 (2017).","DOI":"10.1007\/978-3-319-50520-6_12"},{"key":"38670_CR32","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1007\/s00535-017-1375-5","volume":"53","author":"C Castro","year":"2018","unstructured":"Castro, C., Peleteiro, B. & Lunet, N. Modifiable factors and esophageal cancer: A systematic review of published meta-analyses. J. Gastroenterol. 53, 37\u201351 (2018).","journal-title":"J. Gastroenterol."},{"key":"38670_CR33","doi-asserted-by":"publisher","first-page":"582","DOI":"10.1111\/apt.15650","volume":"51","author":"M Rajilic-Stojanovic","year":"2020","unstructured":"Rajilic-Stojanovic, M. et al. Systematic review: gastric microbiota in health and disease. Aliment. Pharmacol. Therapeutics 51, 582\u2013602 (2020).","journal-title":"Aliment. Pharmacol. Therapeutics"},{"key":"38670_CR34","doi-asserted-by":"publisher","first-page":"188309","DOI":"10.1016\/j.bbcan.2019.07.004","volume":"1872","author":"K Vinasco","year":"2019","unstructured":"Vinasco, K., Mitchell, H. M., Kaakoush, N. O. & Casta\u00f1o-Rodr\u00edguez, N. Microbial carcinogenesis: Lactic acid bacteria in gastric cancer. Biochim. Biophys. Acta BBA-Rev. Cancer 1872, 188309 (2019).","journal-title":"Biochim. Biophys. Acta BBA-Rev. Cancer"},{"key":"38670_CR35","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1016\/S2468-1253(16)30086-3","volume":"2","author":"DRF Elliott","year":"2017","unstructured":"Elliott, D. R. F., Walker, A. W., O\u2019Donovan, M., Parkhill, J. & Fitzgerald, R. C. A non-endoscopic device to sample the oesophageal microbiota: A case-control study. Lancet Gastroenterol. Hepatol. 2, 32\u201342 (2017).","journal-title":"Lancet Gastroenterol. Hepatol."},{"key":"38670_CR36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12885-021-08903-4","volume":"21","author":"E McIlvanna","year":"2021","unstructured":"McIlvanna, E., Linden, G. J., Craig, S. G., Lundy, F. T. & James, J. A. Fusobacterium nucleatum and oral cancer: A critical review. BMC Cancer 21, 1\u201311 (2021).","journal-title":"BMC Cancer"},{"key":"38670_CR37","doi-asserted-by":"publisher","first-page":"104669","DOI":"10.1016\/j.archoralbio.2020.104669","volume":"112","author":"JD Bronzato","year":"2020","unstructured":"Bronzato, J. D. et al. Detection of fusobacterium in oral and head and neck cancer samples: A systematic review and meta-analysis. Arch. Oral Biol. 112, 104669 (2020).","journal-title":"Arch. Oral Biol."},{"key":"38670_CR38","doi-asserted-by":"publisher","first-page":"269","DOI":"10.3390\/cancers15010269","volume":"15","author":"Y-Y Hsieh","year":"2022","unstructured":"Hsieh, Y.-Y., Kuo, W.-L., Hsu, W.-T., Tung, S.-Y. & Li, C. Fusobacterium nucleatum-induced tumor mutation burden predicts poor survival of gastric cancer patients. Cancers 15, 269 (2022).","journal-title":"Cancers"},{"key":"38670_CR39","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1016\/j.canlet.2022.01.014","volume":"530","author":"D Nomoto","year":"2022","unstructured":"Nomoto, D. et al. Fusobacterium nucleatum promotes esophageal squamous cell carcinoma progression via the NOD1\/RIPK2\/nf-$$\\kappa$$b pathway. Cancer Lett. 530, 59\u201367 (2022).","journal-title":"Cancer Lett."}],"container-title":["Scientific Reports"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s41598-023-38670-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41598-023-38670-0","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s41598-023-38670-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,21]],"date-time":"2023-07-21T16:13:55Z","timestamp":1689956035000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s41598-023-38670-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,21]]},"references-count":39,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["38670"],"URL":"https:\/\/doi.org\/10.1038\/s41598-023-38670-0","relation":{},"ISSN":["2045-2322"],"issn-type":[{"value":"2045-2322","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,21]]},"assertion":[{"value":"31 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 July 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 July 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"11821"}}