{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,15]],"date-time":"2025-03-15T04:12:05Z","timestamp":1742011925525,"version":"3.38.0"},"reference-count":40,"publisher":"China Science Publishing & Media Ltd.","issue":"2","license":[{"start":{"date-parts":[[2022,3,7]],"date-time":"2022-03-07T00:00:00Z","timestamp":1646611200000},"content-version":"vor","delay-in-days":65,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We introduce the concept of Canonical Workflow Building Blocks (CWBB), a methodology of describing and wrapping computational tools, in order for them to be utilised in a reproducible manner from multiple workflow languages and execution platforms. The concept is implemented and demonstrated with the BioExcel Building Blocks library (BioBB), a collection of tool wrappers in the field of computational biomolecular simulation. Interoperability across different workflow languages is showcased through a protein Molecular Dynamics setup transversal workflow, built using this library and run with 5 different Workflow Manager Systems (WfMS). We argue such practice is a necessary requirement for FAIR Computational Workflows and an element of Canonical Workflow Frameworks for Research (CWFR) in order to improve widespread adoption and reuse of computational methods across workflow language barriers.<\/jats:p>","DOI":"10.1162\/dint_a_00135","type":"journal-article","created":{"date-parts":[[2022,3,7]],"date-time":"2022-03-07T18:07:09Z","timestamp":1646676429000},"page":"342-357","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":4,"title":["Making Canonical Workflow Building Blocks Interoperable across Workflow Languages"],"prefix":"10.3724","volume":"4","author":[{"given":"Stian","family":"Soiland-Reyes","sequence":"first","affiliation":[{"name":"Department of Computer Science, The University of Manchester, Manchester M13 9PL, UK"},{"name":"Informatics Institute, University of Amsterdam, Amsterdam 1098 XH, The Netherlands"}]},{"given":"Gen\u00eds","family":"Bayarri","sequence":"additional","affiliation":[{"name":"Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona 08028, Spain"}]},{"given":"Pau","family":"Andrio","sequence":"additional","affiliation":[{"name":"The Spanish National Bioinformatics Institute (INB), Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain"}]},{"given":"Robin","family":"Long","sequence":"additional","affiliation":[{"name":"Data Science Institute, Lancaster University, Lancaster, Lancashire LA1 4YW, UK"},{"name":"Research IT, IT Services, The University of Manchester, Manchester M13 9PL, UK"}]},{"given":"Douglas","family":"Lowe","sequence":"additional","affiliation":[{"name":"Research IT, IT Services, The University of Manchester, Manchester M13 9PL, UK"}]},{"given":"Ania","family":"Niewielska","sequence":"additional","affiliation":[{"name":"European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire CB10 1SD, UK"}]},{"given":"Adam","family":"Hospital","sequence":"additional","affiliation":[{"name":"Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona 08028, Spain"}]},{"given":"Paul","family":"Groth","sequence":"additional","affiliation":[{"name":"Informatics Institute, University of Amsterdam, Amsterdam 1098 XH, The Netherlands"}]}],"member":"2026","published-online":{"date-parts":[[2022,4,1]]},"reference":[{"issue":"6317","key":"2022052014404931800_ref1","doi-asserted-by":"crossref","first-page":"1240","DOI":"10.1126\/science.aah6168","article-title":"Enhancing reproducibility for computational methods","volume":"354","author":"Stodden","year":"2016","journal-title":"Science"},{"issue":"9","key":"2022052014404931800_ref2","doi-asserted-by":"crossref","first-page":"1003","DOI":"10.1016\/j.patter.2021.100322","article-title":"The role of metadata in reproducible computational research","volume":"2","author":"Leipzig","year":"2021","journal-title":"Patterns"},{"volume-title":"A fresh look at FAIR for research software","year":"2021","author":"Katz","key":"2022052014404931800_ref3"},{"key":"2022052014404931800_ref4","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1007\/s41019-017-0050-4","article-title":"Robust cross-platform workflows: How technical and scientific communities collaborate to develop, test and share best practices for data analysis","volume":"2","author":"M\u00f6ller","year":"2017","journal-title":"Data Science and Engineering"},{"key":"2022052014404931800_ref5","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1016\/j.future.2017.01.012","article-title":"Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities","volume":"75","author":"Cohen-Boulakia","year":"2017","journal-title":"Future Generation Computer Systems"},{"issue":"6","key":"2022052014404931800_ref6","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1016\/j.cels.2018.03.014","article-title":"Practical computational reproducibility in the life sciences","volume":"6","author":"Gr\u00fcning","year":"2018","journal-title":"Cell Systems"},{"issue":"1","key":"2022052014404931800_ref7","doi-asserted-by":"crossref","first-page":"37","DOI":"10.3233\/DS-190026","article-title":"Towards FAIR principles for research software","volume":"3","author":"Lamprecht","year":"2020","journal-title":"Data Science"},{"issue":"2","key":"2022052014404931800_ref8","doi-asserted-by":"crossref","first-page":"21","DOI":"10.3390\/publications8020021","article-title":"FAIR digital objects for science: From data pieces to actionable knowledge units","volume":"8","author":"De Smedt","year":"2020","journal-title":"Publications"},{"issue":"1","key":"2022052014404931800_ref9","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1162\/dint_a_00033","article-title":"(2020): FAIR Computational Workflows","volume":"2","author":"Goble","year":"2020","journal-title":"Data Intelligence"},{"key":"2022052014404931800_ref10","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1038\/s41597-019-0177-4","article-title":"BioExcel building blocks, a software library for interoperable biomolecular simulation workflows","volume":"6","author":"Andrio","year":"2019","journal-title":"Scientific Data"},{"issue":"10","key":"2022052014404931800_ref11","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1093\/bioinformatics\/btt113","article-title":"EDAM: An ontology of bioinformatics operations, types of data and identifiers, topics and formats","volume":"29","author":"Ison","year":"2013","journal-title":"Bioinformatics"},{"volume-title":"BioExcel-2 Deliverable 2.3\u2014First release of demonstration workflows (2020)","author":"Hospital","key":"2022052014404931800_ref12"},{"volume-title":"(2016): Jupyter notebooks\u2014a publishing format for reproducible computational workflows","author":"Kluyver","key":"2022052014404931800_ref13"},{"issue":"2","key":"2022052014404931800_ref14","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1109\/MCSE.2021.3052101","article-title":"Using Jupyter for reproducible scientific workflows","volume":"23","author":"Beg","year":"2021","journal-title":"Computing in Science & Engineering"},{"key":"2022052014404931800_ref15","first-page":"113","volume-title":"Binder 2.0\u2014Reproducible, interactive, sharable environments for science at scale","author":"Jupyter Project","year":"2018"},{"key":"2022052014404931800_ref16","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1038\/s41592-018-0046-7","article-title":"Bioconda: Sustainable and comprehensive software distribution for the life sciences","volume":"15","author":"Gr\u00fcning","year":"2018","journal-title":"Nature Methods"},{"volume-title":"BioExcel-2 Deliverable 2.5\u2014Provision of a workflow environment at BioExcel portal","author":"Niewielska","key":"2022052014404931800_ref17"},{"issue":"W1","key":"2022052014404931800_ref18","doi-asserted-by":"crossref","first-page":"W537","DOI":"10.1093\/nar\/gky379","article-title":"The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update","volume":"46","author":"Afgan","year":"2018","journal-title":"Nucleic Acids Research"},{"volume-title":"Methods included: Standardizing computational reuse and portability with the common workflow language","author":"Crusoe","key":"2022052014404931800_ref19","doi-asserted-by":"crossref","DOI":"10.1145\/3486897"},{"issue":"1","key":"2022052014404931800_ref20","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1177\/1094342015594678","article-title":"PyCOMPSs: Parallel computational workflows in Python","volume":"31","author":"Tejedor","year":"2017","journal-title":"The International Journal of High Performance Computing Applications"},{"key":"2022052014404931800_ref21","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1016\/j.jbiotec.2017.07.028","article-title":"KNIME for reproducible cross-domain analysis of life science data","volume":"261","author":"Fillbrunn","year":"2017","journal-title":"Journal of Biotechnology"},{"volume-title":"Protein MD setup tutorial using BioExcel building blocks (biobb) in Galaxy","author":"Lowe","key":"2022052014404931800_ref22"},{"volume-title":"Protein MD setup tutorial using BioExcel building blocks (biobb) in KNIME","author":"Hospital","key":"2022052014404931800_ref23"},{"volume-title":"Protein MD setup tutorial using BioExcel building blocks (biobb) in CWL","author":"Bayarri","key":"2022052014404931800_ref24"},{"volume-title":"(2021): Protein MD setup tutorial using BioExcel building blocks (biobb) in Jupyter Notebook","author":"Bayarri","key":"2022052014404931800_ref25"},{"volume-title":"Protein MD setup HPC tutorial using BioExcel building blocks (biobb) in PyCOMPSs","author":"Hospital","key":"2022052014404931800_ref26"},{"issue":"3","key":"2022052014404931800_ref27","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1093\/bib\/bbn003","article-title":"Interoperability with Moby 1.0\u2014It's better than sharing your toothbrush!","volume":"9","author":"The BioMoby Consortium","year":"2008","journal-title":"Briefings in Bioinformatics"},{"issue":"15","key":"2022052014404931800_ref28","doi-asserted-by":"crossref","first-page":"1910","DOI":"10.1093\/bioinformatics\/btl272","article-title":"caGrid: Design and implementation of the core architecture of the cancer biomedical informatics grid","volume":"22","author":"Saltz","year":"2006","journal-title":"Bioinformatics"},{"key":"2022052014404931800_ref29","first-page":"47","volume-title":"A new approach for publishing workflows","author":"Garijo","year":"2011"},{"key":"2022052014404931800_ref30","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1016\/j.future.2013.09.018","article-title":"Common motifs in scientific workflows: An empirical analysis","volume":"36","author":"Garijo","year":"2014","journal-title":"Future generation computer systems"},{"issue":"4","key":"2022052014404931800_ref31","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1111\/ecog.01552","article-title":"ENM components: A new set of Web service \u2013 based workflow components for ecological niche modelling","volume":"39","author":"De Giovanni","year":"2016","journal-title":"Ecography"},{"key":"2022052014404931800_ref32","doi-asserted-by":"crossref","DOI":"10.1186\/gb4161","article-title":"Dissemination of scientific software with Galaxy ToolShed","volume":"15","author":"Blankenberg","year":"2014","journal-title":"Genome Biology"},{"key":"2022052014404931800_ref33","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1038\/nbt.3772","article-title":"Toil enables reproducible, open source, big biomedical data analyses","volume":"35","author":"Vivian","year":"2017","journal-title":"Nature Biotechnology"},{"volume-title":"Packaging research artefacts with RO-Crate","author":"Soiland-Reyes","key":"2022052014404931800_ref34","doi-asserted-by":"crossref","DOI":"10.3233\/DS-210053"},{"issue":"1","key":"2022052014404931800_ref35","doi-asserted-by":"crossref","first-page":"giaa157","DOI":"10.1093\/gigascience\/giaa157","article-title":"biotoolsSchema: A formalized schema for bioinformatics software description","volume":"10","author":"Ison","year":"2021","journal-title":"GigaScience,"},{"volume-title":"CWFR position paper","author":"The CWFR Group","key":"2022052014404931800_ref36"},{"first-page":"e1009823","volume-title":"10 simple rules for making a software tool workflow-ready","author":"Brack","key":"2022052014404931800_ref37"},{"issue":"6","key":"2022052014404931800_ref38","doi-asserted-by":"crossref","first-page":"e2001414","DOI":"10.1371\/journal.pbio.2001414","article-title":"Identifiers for the 21st century: How to design, provision, and reuse identifiers to maximize utility and impact of life science data","volume":"15","author":"McMurry","year":"2017","journal-title":"PLOS Biology"},{"volume-title":"A community roadmap for scientific workflows research and development","year":"2021","author":"Ferreira da Silva","key":"2022052014404931800_ref39"},{"issue":"5","key":"2022052014404931800_ref40","doi-asserted-by":"crossref","first-page":"e1007808","DOI":"10.1371\/journal.pcbi.1007808","article-title":"Ten simple rules to run a successful BioHackathon","volume":"16","author":"Garcia","year":"2020","journal-title":"PLOS Computational Biology"}],"container-title":["Data Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/dint\/article-pdf\/4\/2\/342\/2023530\/dint_a_00135.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/dint\/article-pdf\/4\/2\/342\/2023530\/dint_a_00135.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,14]],"date-time":"2025-03-14T07:43:34Z","timestamp":1741938214000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.sciengine.com\/doi\/10.1162\/dint_a_00135"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"references-count":40,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,4,1]]}},"URL":"https:\/\/doi.org\/10.1162\/dint_a_00135","relation":{},"ISSN":["2641-435X"],"issn-type":[{"type":"electronic","value":"2641-435X"}],"subject":[],"published-other":{"date-parts":[[2022]]},"published":{"date-parts":[[2022]]}}}