{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T19:53:52Z","timestamp":1777406032712,"version":"3.51.4"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1011676","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,12,7]],"date-time":"2023-12-07T00:00:00Z","timestamp":1701907200000}}],"reference-count":31,"publisher":"Public Library of Science (PLoS)","issue":"11","license":[{"start":{"date-parts":[[2023,11,27]],"date-time":"2023-11-27T00:00:00Z","timestamp":1701043200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["1U24CA248454-01"],"award-info":[{"award-number":["1U24CA248454-01"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Study reproducibility is essential to corroborate, build on, and learn from the results of scientific research but is notoriously challenging in bioinformatics, which often involves large data sets and complex analytic workflows involving many different tools. Additionally, many biologists are not trained in how to effectively record their bioinformatics analysis steps to ensure reproducibility, so critical information is often missing. Software tools used in bioinformatics can automate provenance tracking of the results they generate, removing most barriers to bioinformatics reproducibility. Here we present an implementation of that idea, Provenance Replay, a tool for generating new executable code from results generated with the QIIME 2 bioinformatics platform, and discuss considerations for bioinformatics developers who wish to implement similar functionality in their software.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1011676","type":"journal-article","created":{"date-parts":[[2023,11,27]],"date-time":"2023-11-27T19:24:30Z","timestamp":1701113070000},"page":"e1011676","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":13,"title":["Facilitating bioinformatics reproducibility with QIIME 2 Provenance Replay"],"prefix":"10.1371","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4289-0991","authenticated-orcid":true,"given":"Christopher R.","family":"Keefe","sequence":"first","affiliation":[]},{"given":"Matthew R.","family":"Dillon","sequence":"additional","affiliation":[]},{"given":"Elizabeth","family":"Gehret","sequence":"additional","affiliation":[]},{"given":"Chloe","family":"Herman","sequence":"additional","affiliation":[]},{"given":"Mary","family":"Jewell","sequence":"additional","affiliation":[]},{"given":"Colin V.","family":"Wood","sequence":"additional","affiliation":[]},{"given":"Evan","family":"Bolyen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8865-1670","authenticated-orcid":true,"given":"J. Gregory","family":"Caporaso","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2023,11,27]]},"reference":[{"key":"pcbi.1011676.ref001","first-page":"1","article-title":"Social, behavioral, and economic sciences perspectives on robust and reliable science","author":"JT Cacioppo","year":"2015","journal-title":"Report of the Subcommittee on Replicability in Science Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences."},{"key":"pcbi.1011676.ref002","author":"University of California Museum of Paleontology","year":"2022","journal-title":"How Science Works. Understanding Science"},{"key":"pcbi.1011676.ref003","author":"MS Gazzaniga","year":"2018","journal-title":"Psychological science 2018. 6th ed"},{"key":"pcbi.1011676.ref004","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1087\/20150104","article-title":"Peer review: still king in the digital age.","volume":"28","author":"D Nicholas","year":"2015","journal-title":"Learn Publ"},{"key":"pcbi.1011676.ref005","doi-asserted-by":"crossref","first-page":"aac4716","DOI":"10.1126\/science.aac4716","article-title":"Estimating the reproducibility of psychological science.","volume":"349","author":"Open Science Collaboration","year":"2015","journal-title":"Science"},{"key":"pcbi.1011676.ref006","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1038\/533452a","article-title":"1,500 scientists lift the lid on reproducibility","volume":"533","author":"M. Baker","year":"2016","journal-title":"Nature"},{"key":"pcbi.1011676.ref007","unstructured":"The Turing Way Community. The Turing Way: A handbook for reproducible, ethical and collaborative research. doi: 10.5281\/zenodo.7625728"},{"key":"pcbi.1011676.ref008","author":"OE Gundersen","year":"2018","journal-title":"State of the Art: Reproducibility in Artificial Intelligence"},{"key":"pcbi.1011676.ref009","doi-asserted-by":"crossref","first-page":"2632","DOI":"10.1073\/pnas.1711786114","article-title":"Scientific progress despite irreproducibility: A seeming paradox","volume":"115","author":"RM Shiffrin","year":"2018","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"pcbi.1011676.ref010","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1007\/11890850_16","volume-title":"Provenance and Annotation of Data.","author":"Y Zhao","year":"2006"},{"key":"pcbi.1011676.ref011","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41562-016-0021","article-title":"A manifesto for reproducible science","volume":"1","author":"MR Munaf\u00f2","year":"2017","journal-title":"Nature Human Behaviour"},{"key":"pcbi.1011676.ref012","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1126\/science.1179653","article-title":"Computer science. Accessible reproducible research","volume":"327","author":"JP Mesirov","year":"2010","journal-title":"Science"},{"key":"pcbi.1011676.ref013","doi-asserted-by":"crossref","first-page":"2520","DOI":"10.1093\/bioinformatics\/bts480","article-title":"Snakemake\u2014a scalable bioinformatics workflow engine","volume":"28","author":"J K\u00f6ster","year":"2012","journal-title":"Bioinformatics"},{"key":"pcbi.1011676.ref014","doi-asserted-by":"crossref","first-page":"e1007664","DOI":"10.1371\/journal.pcbi.1007664","article-title":"Tximeta: Reference sequence checksums for provenance identification in RNA-seq.","volume":"16","author":"MI Love","year":"2020","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1011676.ref015","doi-asserted-by":"crossref","first-page":"giz095","DOI":"10.1093\/gigascience\/giz095","article-title":"Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv.","volume":"8","author":"FZ Khan","year":"2019","journal-title":"Gigascience"},{"key":"pcbi.1011676.ref016","doi-asserted-by":"crossref","first-page":"599","DOI":"10.1016\/j.future.2011.08.004","article-title":"Why linked data is not enough for scientists.","volume":"29","author":"S Bechhofer","year":"2013","journal-title":"Future Gener Comput Syst"},{"key":"pcbi.1011676.ref017","doi-asserted-by":"crossref","first-page":"852","DOI":"10.1038\/s41587-019-0209-9","article-title":"Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2","volume":"37","author":"E Bolyen","year":"2019","journal-title":"Nat Biotechnol"},{"key":"pcbi.1011676.ref018","article-title":"PepSIRF + QIIME 2: software tools for automated, reproducible analysis of highly-multiplexed serology data.","author":"AM Brown","year":"2022","journal-title":"arXiv [q-bio.QM]."},{"key":"pcbi.1011676.ref019","doi-asserted-by":"crossref","first-page":"657","DOI":"10.12688\/f1000research.24751.1","article-title":"Reproducibly sampling SARS-CoV-2 genomes across time, geography, and viral diversity.","volume":"9","author":"E Bolyen","year":"2020","journal-title":"F1000Res"},{"key":"pcbi.1011676.ref020","author":"Python Software Foundation","year":"2001","journal-title":"Python Language Reference. Python Software Foundation"},{"key":"pcbi.1011676.ref021","doi-asserted-by":"crossref","first-page":"11","DOI":"10.25080\/TCWV9851","volume-title":"Proceedings of the 7th Python in Science Conference.","author":"AA Hagberg","year":"2008"},{"key":"pcbi.1011676.ref022","article-title":"community. PyYAML","author":"YAML Simonov K","year":"2006","journal-title":"The YAML Project"},{"key":"pcbi.1011676.ref023","unstructured":"Boulogne F, Mangin O, Verney L, Al E. BibTexParser. sciunto-org; Available from: https:\/\/bibtexparser.readthedocs.io\/en\/master\/."},{"key":"pcbi.1011676.ref024","author":"Pallets","year":"2014","journal-title":"Click. Pallets"},{"key":"pcbi.1011676.ref025","doi-asserted-by":"crossref","first-page":"319","DOI":"10.2307\/249008","article-title":"Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology.","volume":"13","author":"FD Davis","year":"1989","journal-title":"Miss Q."},{"key":"pcbi.1011676.ref026","article-title":"Improving In Silico Scientific Reproducibility With Provenance Replay Software.","author":"CR Keefe","year":"2022","journal-title":"Master of Science, Northern Arizona University."},{"key":"pcbi.1011676.ref027","author":"EM Borsom","year":"2022","journal-title":"Predicting neurodegenerative disease using pre-pathology gut microbiota composition: a longitudinal study in mice modeling Alzheimer\u2019s disease pathologies"},{"key":"pcbi.1011676.ref028","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1186\/s40168-023-01590-2","article-title":"Oligofructose improves small intestinal lipid-sensing mechanisms via alterations to the small intestinal microbiota.","volume":"11","author":"SN Weninger","year":"2023","journal-title":"Microbiome"},{"key":"pcbi.1011676.ref029","doi-asserted-by":"crossref","first-page":"796","DOI":"10.1038\/s41592-018-0141-9","article-title":"Qiita: rapid, web-enabled microbiome meta-analysis.","volume":"15","author":"A Gonzalez","year":"2018","journal-title":"Nat Methods"},{"key":"pcbi.1011676.ref030","doi-asserted-by":"crossref","first-page":"5081","DOI":"10.1093\/bioinformatics\/btac639","article-title":"Reproducible acquisition, management and meta-analysis of nucleotide sequence (meta)data using q2-fondue.","volume":"38","author":"M Ziemski","year":"2022","journal-title":"Bioinformatics"},{"key":"pcbi.1011676.ref031","doi-asserted-by":"crossref","first-page":"W537","DOI":"10.1093\/nar\/gky379","article-title":"The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update","volume":"46","author":"E Afgan","year":"2018","journal-title":"Nucleic Acids Res"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1011676","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,12,7]],"date-time":"2023-12-07T00:00:00Z","timestamp":1701907200000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011676","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,3]],"date-time":"2024-11-03T23:33:00Z","timestamp":1730676780000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011676"}},"subtitle":[],"editor":[{"given":"Jan-Ulrich","family":"Kreft","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,11,27]]},"references-count":31,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2023,11,27]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1011676","relation":{"new_version":[{"id-type":"doi","id":"10.1371\/journal.pcbi.1011676","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,27]]}}}