{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T04:42:46Z","timestamp":1780548166797,"version":"3.54.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2010,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Small molecules are of increasing interest for bioinformatics in areas such as metabolomics and drug discovery. The recent release of large open access chemistry databases generates a demand for flexible tools to process them and discover new knowledge. To freely support open science based on these data resources, it is desirable for the processing tools to be open source and available for everyone.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Here we describe a novel combination of the workflow engine Taverna and the cheminformatics library Chemistry Development Kit (CDK) resulting in a open source workflow solution for cheminformatics. We have implemented more than 160 different workers to handle specific cheminformatics tasks. We describe the applications of CDK-Taverna in various usage scenarios.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>The combination of the workflow engine Taverna and the Chemistry Development Kit provides the first open source cheminformatics workflow solution for the biosciences. 
With the Taverna-community working towards a more powerful workflow engine and a more user-friendly user interface, CDK-Taverna has the potential to become a free alternative to existing proprietary workflow tools.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-11-159","type":"journal-article","created":{"date-parts":[[2010,3,30]],"date-time":"2010-03-30T06:14:14Z","timestamp":1269929654000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":57,"title":["CDK-Taverna: an open workflow environment for cheminformatics"],"prefix":"10.1186","volume":"11","author":[{"given":"Thomas","family":"Kuhn","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Egon L","family":"Willighagen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Achim","family":"Zielesny","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christoph","family":"Steinbeck","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2010,3,29]]},"reference":[{"key":"3616_CR1","unstructured":"The PubChem Project[http:\/\/pubchem.ncbi.nlm.nih.gov\/]"},{"key":"3616_CR2","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1021\/ci049714+","volume":"45","author":"J Irwin","year":"2005","unstructured":"Irwin J, Shoichet B: ZINC - A Free Database of Commercially Available Compounds for Virtual Screening. Journal of Chemical Information and Modeling 2005, 45: 177\u2013182. 10.1021\/ci049714+","journal-title":"Journal of Chemical Information and Modeling"},{"key":"3616_CR3","unstructured":"The ChEMBL Group[http:\/\/www.ebi.ac.uk\/chembl]"},{"issue":"3","key":"3616_CR4","first-page":"393","volume":"11","author":"AJ Williams","year":"2008","unstructured":"Williams AJ: Public chemical compound databases. 
Current opinion in drug discovery & development 2008, 11(3):393\u2013404.","journal-title":"Current opinion in drug discovery & development"},{"issue":"3","key":"3616_CR5","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1007\/s11030-006-9041-5","volume":"10","author":"M Hassan","year":"2006","unstructured":"Hassan M, Brown RD, Varma-O'brien S, Rogers D: Cheminformatics analysis and learning in a data pipelining environment. Molecular diversity 2006, 10(3):283\u2013299. 10.1007\/s11030-006-9041-5","journal-title":"Molecular diversity"},{"issue":"3","key":"3616_CR6","first-page":"381","volume":"11","author":"J Shon","year":"2008","unstructured":"Shon J, Ohkawa H, Hammer J: Scientific workflows as productivity tools for drug discovery. Current opinion in drug discovery & development 2008, 11(3):381\u2013388.","journal-title":"Current opinion in drug discovery & development"},{"key":"3616_CR7","unstructured":"Pipeline Pilot data analysis and reporting platform[http:\/\/accelrys.com\/products\/scitegic\/]"},{"key":"3616_CR8","unstructured":"Inforsense Platform[http:\/\/www.inforsense.com\/products\/core_technology\/inforsense_platform\/]"},{"key":"3616_CR9","unstructured":"KNIME Konstanz Information Miner[http:\/\/www.knime.org\/]"},{"issue":"5-6","key":"3616_CR10","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1016\/j.compbiolchem.2007.08.009","volume":"31","author":"A Tiwari","year":"2007","unstructured":"Tiwari A, Sekhar AKT: Workflow based framework for life science informatics. Computational Biology and Chemistry 2007, 31(5\u20136):305\u2013319. 
10.1016\/j.compbiolchem.2007.08.009","journal-title":"Computational Biology and Chemistry"},{"issue":"17","key":"3616_CR11","doi-asserted-by":"publisher","first-page":"2111","DOI":"10.2174\/138161206777585274","volume":"12","author":"C Steinbeck","year":"2006","unstructured":"Steinbeck C, Hoppe C, Kuhn S, Guha R, Willighagen EL: Recent Developments of The Chemistry Development Kit (CDK) - An Open-Source Java Library for Chemo- and Bioinformatics. Current Pharmaceutical Design 2006, 12(17):2111\u20132120.","journal-title":"Current Pharmaceutical Design"},{"issue":"2","key":"3616_CR12","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Han YQ, Kuhn S, Horlacher O, Luttmann E, Willighagen E: The Chemistry Development Kit (CDK): An open-source Java library for chemo- and bioinformatics. Journal of Chemical Information and Computer Sciences 2003, 43(2):493\u2013500.","journal-title":"Journal of Chemical Information and Computer Sciences"},{"key":"3616_CR13","unstructured":"CDK-Taverna fully recognized[http:\/\/chem-bla-ics.blogspot.com\/2005\/10\/cdk-taverna-fully-recognized.html]"},{"key":"3616_CR14","unstructured":"CDK-Taverna Release on 2005\u201310\u201318[http:\/\/sourceforge.net\/projects\/cdk\/files\/CDK-Taverna\/20051018\/]"},{"issue":"17","key":"3616_CR15","doi-asserted-by":"publisher","first-page":"3045","DOI":"10.1093\/bioinformatics\/bth361","volume":"20","author":"T Oinn","year":"2004","unstructured":"Oinn T, Addis M, Ferris J, Marvin D, Senger M, Greenwood M, Carver T, Glover K, Pocock MR, Wipat A, Li P: Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics 2004, 20(17):3045\u20133054. 
10.1093\/bioinformatics\/bth361","journal-title":"Bioinformatics"},{"key":"3616_CR16","unstructured":"The Open Source Definition[http:\/\/www.opensource.org\/docs\/osd]"},{"key":"3616_CR17","doi-asserted-by":"crossref","unstructured":"Spjuth O, Helmus T, Willighagen EL, Kuhn S, Eklund M, Steinbeck C, Wikberg JE: Bioclipse: An open rich client workbench for chemo- and bioinformatics. BMC Bioinformatics 2007., 8(59):","DOI":"10.1186\/1471-2105-8-59"},{"key":"3616_CR18","unstructured":"pgchem::tigress: chemoinformatics extension to the PostgreSQL[http:\/\/pgfoundry.org\/projects\/pgchem\/]"},{"key":"3616_CR19","unstructured":"Apache Maven[http:\/\/maven.apache.org\/]"},{"key":"3616_CR20","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-84628-757-2","volume-title":"Workflows for e-Science: Scientific Workflows for Grids","author":"IJ Taylor","year":"2007","unstructured":"Taylor IJ, Deelman E, Gannon DB, Shields M: Workflows for e-Science: Scientific Workflows for Grids. London, Springer; 2007."},{"key":"3616_CR21","unstructured":"W3C Web Service Definition Language (WSDL)[http:\/\/www.w3.org\/TR\/wsdl]"},{"key":"3616_CR22","unstructured":"W3C SOAP Specifications[http:\/\/www.w3.org\/TR\/soap\/]"},{"key":"3616_CR23","volume-title":"PhD thesis","author":"T Kuhn","year":"2009","unstructured":"Kuhn T: Open Source Workflow Engine for Cheminformatics: From Data Curation to Data Analysis. PhD thesis. 
University of Cologne; 2009."},{"key":"3616_CR24","unstructured":"myExperiment - Tags - Workflows only - cdk-taverna[http:\/\/www.myexperiment.org\/tags\/914?type=workflows]"},{"key":"3616_CR25","unstructured":"PostgreSQL Database[http:\/\/www.postgresql.org\/]"},{"key":"3616_CR26","unstructured":"GiST Support for PostgreSQL[http:\/\/www.sai.msu.su\/~megera\/postgres\/gist\/]"},{"key":"3616_CR27","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences 1988, 28: 31\u201336.","journal-title":"Journal of Chemical Information and Computer Sciences"},{"issue":"3","key":"3616_CR28","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1021\/ci00007a012","volume":"32","author":"A Dalby","year":"1992","unstructured":"Dalby A, Nourse JG, Hounshell WD, Gushurst AKI, Grier DL, Leland BA, Laufer J: Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited. Journal of Chemical Information and Computer Sciences 1992, 32(3):244\u2013255.","journal-title":"Journal of Chemical Information and Computer Sciences"},{"issue":"6","key":"3616_CR29","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1021\/ci990052b","volume":"39","author":"P Murray-Rust","year":"1999","unstructured":"Murray-Rust P, Rzepa H: Chemical Markup, XML, and the Worldwide Web. 1. Basic Principles. Journal of Chemical Information and Computer Sciences 1999, 39(6):928\u2013942.","journal-title":"Journal of Chemical Information and Computer Sciences"},{"key":"3616_CR30","first-page":"4+","volume":"4","author":"EL Willighagen","year":"2001","unstructured":"Willighagen EL: Processing CML conventions in Java. 
Internet Journal of Chemistry 2001, 4: 4+.","journal-title":"Internet Journal of Chemistry"},{"issue":"6","key":"3616_CR31","doi-asserted-by":"publisher","first-page":"2015","DOI":"10.1021\/ci600531a","volume":"47","author":"S Kuhn","year":"2007","unstructured":"Kuhn S, Helmus T, Lancashire R, Murray-Rust P, Rzepa H, Steinbeck C, Willighagen E: Chemical Markup, XML, and the World Wide Web. 7. CMLSpect, an XML Vocabulary for Spectral Data. Journal of Chemical Information and Modeling 2007, 47(6):2015\u20132034. 10.1021\/ci600531a","journal-title":"Journal of Chemical Information and Modeling"},{"key":"3616_CR32","unstructured":"The database of chemical entities of biological Interest (ChEBI)[http:\/\/www.ebi.ac.uk\/chebi]"},{"key":"3616_CR33","doi-asserted-by":"publisher","first-page":"344","DOI":"10.1093\/nar\/gkm791","volume":"36","author":"K Degtyarenko","year":"2008","unstructured":"Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, Alcantara R, Darsow M, Guedj M, Ashburner M: ChEBI: a database and ontology for chemical entities of biological interest. Nucl Acids Res 2008, 36: 344\u2013350. 10.1093\/nar\/gkm791","journal-title":"Nucl Acids Res"},{"issue":"4","key":"3616_CR34","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1016\/0893-6080(91)90045-7","volume":"4","author":"G Carpenter","year":"1991","unstructured":"Carpenter G, Grossberg S, Rosen D: ART 2-A: an adaptive resonance algorithm for rapid category learning and recognition. Neural networks 1991, 4(4):493\u2013504. 
10.1016\/0893-6080(91)90045-7","journal-title":"Neural networks"},{"key":"3616_CR35","unstructured":"myExperiment - Topological Substructure Workflow[http:\/\/www.myexperiment.org\/workflows\/557\/]"},{"key":"3616_CR36","unstructured":"myExperiment - Substructure Search on Database Workflow[http:\/\/www.myexperiment.org\/workflows\/555\/]"},{"key":"3616_CR37","unstructured":"The CDK-Taverna Blog[http:\/\/cdktaverna.wordpress.com\/2008\/09\/07\/time-evaluation-for-calculating-molecular-descriptors-using-the-cdk\/]"},{"key":"3616_CR38","unstructured":"myExperiment - Calculation of molecular descriptors for molecules loaded from database[http:\/\/www.myexperiment.org\/workflows\/563\/]"},{"key":"3616_CR39","unstructured":"myExperiment - Reaction Enumeration Workflow[http:\/\/www.myexperiment.org\/workflows\/567\/]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-11-159.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,25]],"date-time":"2024-03-25T11:47:55Z","timestamp":1711367275000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-11-159"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,3,29]]},"references-count":39,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,12]]}},"alternative-id":["3616"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-11-159","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,3,29]]},"assertion":[{"value":"7 September 2009","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 March 2010","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"29 March 2010","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"159"}}