{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T16:21:46Z","timestamp":1774282906524,"version":"3.50.1"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2024,9,12]],"date-time":"2024-09-12T00:00:00Z","timestamp":1726099200000},"content-version":"vor","delay-in-days":73,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["#1849206"],"award-info":[{"award-number":["#1849206"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["# 1920954"],"award-info":[{"award-number":["# 1920954"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Institutional Development Award"},{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["P20GM103443"],"award-info":[{"award-number":["P20GM103443"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,7,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>In an environment, microbes often work in communities to achieve most of their essential functions, including the production of essential nutrients. 
Microbial biofilms are communities of microbes that attach to a nonliving or living surface by embedding themselves into a self-secreted matrix of extracellular polymeric substances. These communities work together to enhance their colonization of surfaces, produce essential nutrients, and achieve their essential functions for growth and survival. They often consist of diverse microbes including bacteria, viruses, and fungi. Biofilms play a critical role in influencing plant phenotypes and human microbial infections. Understanding how these biofilms impact plant health, human health, and the environment is important for analyzing genotype\u2013phenotype-driven rule-of-life functions. Such fundamental knowledge can be used to precisely control the growth of biofilms on a given surface. Metagenomics is a powerful tool for analyzing biofilm genomes through function-based gene and protein sequence identification (functional metagenomics) and sequence-based function identification (sequence metagenomics). Metagenomic sequencing enables a comprehensive sampling of all genes in all organisms present within a biofilm sample. However, the complexity of biofilm metagenomic study warrants the increasing need to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) Guiding Principles for scientific data management. This will ensure that scientific findings can be more easily validated by the research community. This study proposes a dockerized, self-learning bioinformatics workflow to increase the community adoption of metagenomics toolkits in a metagenomics and meta-transcriptomics investigation. Our biofilm metagenomics workflow self-learning module includes integrated learning resources with an interactive dockerized workflow. This module will allow learners to analyze resources that are beneficial for aggregating knowledge about biofilm marker genes, proteins, and metabolic pathways as they define the composition of specific microbial communities. 
Cloud and dockerized technology can allow novice learners\u2014even those with minimal knowledge in computer science\u2014to use complicated bioinformatics tools. Our cloud-based, dockerized workflow splits biofilm microbiome metagenomics analyses into four easy-to-follow submodules. A variety of tools are built into each submodule. As students navigate these submodules, they learn about each tool used to accomplish the task. The downstream analysis is conducted using processed data obtained from online resources or raw data processed via Nextflow pipelines. This analysis takes place within Vertex AI\u2019s Jupyter notebook instance with R and Python kernels. Subsequently, results are stored and visualized in Google Cloud storage buckets, alleviating the computational burden on local resources. The result is a comprehensive tutorial that guides bioinformaticians of any skill level through the entire workflow. It enables them to comprehend and implement the necessary processes involved in this integrated workflow from start to finish.<\/jats:p>\n               <jats:p>This manuscript describes the development of a resource module that is part of a learning platform named \u201cNIGMS Sandbox for Cloud-based Learning\u201d https:\/\/github.com\/NIGMS\/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. 
This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.<\/jats:p>","DOI":"10.1093\/bib\/bbae429","type":"journal-article","created":{"date-parts":[[2024,9,13]],"date-time":"2024-09-13T05:49:57Z","timestamp":1726206597000},"source":"Crossref","is-referenced-by-count":5,"title":["Biofilm marker discovery with cloud-based dockerized metagenomics analysis of microbial communities"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5338-084X","authenticated-orcid":false,"given":"Etienne Z","family":"Gnimpieba","sequence":"first","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4127-9672","authenticated-orcid":false,"given":"Timothy W","family":"Hartman","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1817-4565","authenticated-orcid":false,"given":"Tuyen","family":"Do","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0923-5397","authenticated-orcid":false,"given":"Jessica","family":"Zylla","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. 
Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8537-1115","authenticated-orcid":false,"given":"Shiva","family":"Aryal","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2557-2358","authenticated-orcid":false,"given":"Samuel J","family":"Haas","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3634-8370","authenticated-orcid":false,"given":"Diing D M","family":"Agany","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9818-5108","authenticated-orcid":false,"given":"Bichar Dip Shrestha","family":"Gurung","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. 
Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]},{"given":"Valena","family":"Doe","sequence":"additional","affiliation":[{"name":"Google Cloud , 1900 Reston Metro Plaza, Reston, Virginia, 20190, United States"}]},{"given":"Zelaikha","family":"Yosufzai","sequence":"additional","affiliation":[{"name":"Health Data and AI , Deloitte Consulting LLP, 1919 N Lynn St., Suite 1500, Arlington, Virginia, 22209, United States"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4829-2877","authenticated-orcid":false,"given":"Daniel","family":"Pan","sequence":"additional","affiliation":[{"name":"Health Data and AI , Deloitte Consulting LLP, 1919 N Lynn St., Suite 1500, Arlington, Virginia, 22209, United States"}]},{"given":"Ross","family":"Campbell","sequence":"additional","affiliation":[{"name":"Health Data and AI , Deloitte Consulting LLP, 1919 N Lynn St., Suite 1500, Arlington, Virginia, 22209, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2555-1322","authenticated-orcid":false,"given":"Victor C","family":"Huber","sequence":"additional","affiliation":[{"name":"Basic Biomedical Sciences Division , University of South Dakota, 414 E. Clark St, Vermillion, South Dakota, 57069, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5493-252X","authenticated-orcid":false,"given":"Rajesh","family":"Sani","sequence":"additional","affiliation":[{"name":"South Dakota School of Mines & Technology , 501 E. Saint Joseph St., Rapid City, South Dakota, 57701, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8418-3515","authenticated-orcid":false,"given":"Venkataramana","family":"Gadhamshetty","sequence":"additional","affiliation":[{"name":"South Dakota School of Mines & Technology , 501 E. 
Saint Joseph St., Rapid City, South Dakota, 57701, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3838-5690","authenticated-orcid":false,"given":"Carol","family":"Lushbough","sequence":"additional","affiliation":[{"name":"Biomedical Engineering Department , University of South Dakota, 4800 N. Career Ave., Suite 221, Sioux Falls, South Dakota, 57107, United States"}]}],"member":"286","published-online":{"date-parts":[[2024,9,11]]},"reference":[{"key":"2024091223382938900_ref1","doi-asserted-by":"publisher","DOI":"10.3389\/fmicb.2016.00592","article-title":"Anti-biofilm activity as a health issue","volume":"7","author":"Miquel","year":"2016","journal-title":"Front Microbiol"},{"key":"2024091223382938900_ref2","doi-asserted-by":"publisher","DOI":"10.5808\/GI.2019.17.1.e6","article-title":"Statistical analysis of metagenomics data","volume":"17","author":"Luz, Calle","year":"2019","journal-title":"Genomics Inf"},{"key":"2024091223382938900_ref3","doi-asserted-by":"publisher","first-page":"7298","DOI":"10.1128\/AEM.69.12.7298-7309.2003","article-title":"Metagenome survey of biofilms in drinking-water networks","volume":"69","author":"Schmeisser","year":"2003","journal-title":"Appl Environ Microbiol"},{"key":"2024091223382938900_ref4","doi-asserted-by":"publisher","first-page":"113102","DOI":"10.1016\/j.envres.2022.113102","article-title":"Omics approaches in bioremediation of environmental contaminants: An integrated approach for environmental safety and sustainability","volume":"211","author":"Sharma","year":"2022","journal-title":"Environ Res"},{"key":"2024091223382938900_ref5","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1038\/nrg3575","article-title":"Systems genetics approaches to understand complex traits","volume":"15","author":"Civelek","year":"2014","journal-title":"Nat Rev Genet"},{"key":"2024091223382938900_ref6","doi-asserted-by":"publisher","first-page":"891","DOI":"10.1093\/bib\/bbv090","article-title":"Transcriptomic and 
metabolomic data integration","volume":"17","author":"Cavill","year":"2016","journal-title":"Brief Bioinform"},{"key":"2024091223382938900_ref7","doi-asserted-by":"publisher","first-page":"669","DOI":"10.1093\/bib\/bbs054","article-title":"Classification of metagenomic sequences: Methods and challenges","volume":"13","author":"Mande","year":"2012","journal-title":"Brief Bioinform"},{"key":"2024091223382938900_ref8","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1016\/j.csbj.2016.11.005","article-title":"Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics","volume":"15","author":"Sedlar","year":"2016","journal-title":"Comput Struct Biotechnol J"},{"key":"2024091223382938900_ref9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-020-03815-9","article-title":"MetaLAFFA: A flexible, end-to-end, distributed computing-compatible metagenomic functional annotation pipeline","volume":"21","author":"Eng","year":"2020","journal-title":"BMC Bioinformatics"},{"key":"2024091223382938900_ref10","article-title":"An introduction to docker and analysis of its performance","volume":"17","author":"Rad","year":"2017","journal-title":"IJCSNS Int J Comput Sci Network Secur"},{"key":"2024091223382938900_ref11","doi-asserted-by":"crossref","DOI":"10.1109\/JCDL.2017.7991618","article-title":"Using the Jupyter notebook as a tool for Open Science: An empirical study","volume-title":"Proceedings of the ACM\/IEEE Joint Conference on Digital Libraries","author":"Randles","year":"2017"},{"key":"2024091223382938900_ref12","first-page":"3137","article-title":"FQC dashboard: Integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool","volume":"33","author":"Brown","year":"2017","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2024091223382938900_ref13","first-page":"3047","article-title":"MultiQC: Summarize analysis results for multiple tools and samples in 
a single report","volume":"32","author":"Ewels","year":"2016","journal-title":"Nat Methods"},{"key":"2024091223382938900_ref14","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/978-1-0716-2067-0_11","article-title":"Trimming and validation of Illumina short reads using Trimmomatic, trinity assembly, and assessment of RNA-Seq data","volume":"2443","author":"Sewe","year":"2022","journal-title":"Methods Mol Biol"},{"key":"2024091223382938900_ref15","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1007\/978-1-4939-8728-3_8","article-title":"16S rRNA gene analysis with QIIME2","volume":"1849","author":"Hall","year":"2018","journal-title":"Methods Mol Biol"},{"key":"2024091223382938900_ref16","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1038\/s41587-020-0548-6","article-title":"PICRUSt2 for prediction of metagenome functions","volume":"38","author":"Douglas","year":"2020","journal-title":"Nat Biotechnol"},{"key":"2024091223382938900_ref17","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1007\/978-1-4842-4470-8_38","article-title":"Google BigQuery","volume-title":"Building Machine Learning and Deep Learning Models on Google Cloud Platform Apress","author":"Bisong","year":"2019"},{"key":"2024091223382938900_ref18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2629691","article-title":"NCBI BLASTP on high-performance reconfigurable computing systems","volume":"7","author":"Mahram","year":"2015","journal-title":"ACM Trans. Reconfigurable Technol. 
Syst"},{"key":"2024091223382938900_ref19","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1007\/978-1-4842-4470-8_2","article-title":"An overview of Google cloud platform services","volume-title":"Building Machine Learning and Deep Learning Models on Google Cloud Platform Apress","author":"Bisong","year":"2019"},{"key":"2024091223382938900_ref20","doi-asserted-by":"publisher","first-page":"R50","DOI":"10.1186\/gb-2011-12-5-r50","article-title":"Moving pictures of the human microbiome","volume":"12","author":"Gregory Caporaso","year":"2011","journal-title":"Genome Biol"},{"key":"2024091223382938900_ref21","doi-asserted-by":"publisher","first-page":"1694","DOI":"10.1126\/science.1177486","article-title":"Bacterial community variation in human body habitats across space and time","volume":"326","author":"Costello","year":"2009","journal-title":"Science (New York, NY)"},{"key":"2024091223382938900_ref22","doi-asserted-by":"publisher","first-page":"646","DOI":"10.1093\/bib\/bbs031","article-title":"Taxonomic binning of metagenome samples generated by next-generation sequencing technologies","volume":"13","author":"Dr\u00f6ge","year":"2012","journal-title":"Brief Bioinform"},{"key":"2024091223382938900_ref23","doi-asserted-by":"publisher","first-page":"e100","DOI":"10.1002\/cpbi.100","article-title":"QIIME 2 enables comprehensive end-to-end analysis of diverse microbiome data and comparative studies with publicly available data","volume":"70","author":"Estaki","year":"2020","journal-title":"Curr Protoc Bioinformatics"},{"key":"2024091223382938900_ref24","doi-asserted-by":"publisher","first-page":"188","DOI":"10.1186\/s13059-020-02084-2","article-title":"GMM-Demux: Sample demultiplexing, multiplet detection, experiment planning, and novel cell-type verification in single cell sequencing","volume":"21","author":"Xin","year":"2020","journal-title":"Genome 
Biol"},{"key":"2024091223382938900_ref25","doi-asserted-by":"publisher","first-page":"581","DOI":"10.1038\/nmeth.3869","article-title":"DADA2: High-resolution sample inference from Illumina amplicon data","volume":"13","author":"Callahan","year":"2016","journal-title":"Nat Methods"},{"key":"2024091223382938900_ref26","doi-asserted-by":"publisher","DOI":"10.1128\/mBio.02331-17","article-title":"Bacterial quorum sensing and microbial community interactions","volume":"9","author":"Abisado","year":"2018","journal-title":"MBio"},{"key":"2024091223382938900_ref27","doi-asserted-by":"publisher","first-page":"258","DOI":"10.1093\/nar\/gkg034","article-title":"STRING: A database of predicted functional associations between proteins","volume":"31","author":"von Mering","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2024091223382938900_ref28","doi-asserted-by":"publisher","first-page":"937","DOI":"10.1016\/j.tibtech.2020.04.002","article-title":"The biofilms structural database","volume":"38","author":"Magalh\u00e3es","year":"2020","journal-title":"Trends Biotechnol"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/Supplement_1\/bbae429\/59089895\/bbae429.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/Supplement_1\/bbae429\/59089895\/bbae429.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,13]],"date-time":"2024-09-13T05:50:09Z","timestamp":1726206609000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae429\/7755369"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7]]},"references-count":28,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2024,7,23]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae429","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,7]]},"published":{"date-parts":[[2024,7]]},"article-number":"bbae429"}}