New developments on the cheminformatics open workflow environment CDK-Taverna
BACKGROUND: The computational processing and analysis of small molecules is at heart of cheminformatics and structural bioinformatics and their application in e.g. metabolomics or drug discovery. Pipelining or workflow tools allow for the Lego™-like, graphical assembly of I/O modules and algorithms...
Main authors: | Truszkowski, Andreas; Jayaseelan, Kalai Vanii; Neumann, Stefan; Willighagen, Egon L; Zielesny, Achim; Steinbeck, Christoph |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3292505/ https://www.ncbi.nlm.nih.gov/pubmed/22166170 http://dx.doi.org/10.1186/1758-2946-3-54 |
_version_ | 1782225286720389120 |
---|---|
author | Truszkowski, Andreas Jayaseelan, Kalai Vanii Neumann, Stefan Willighagen, Egon L Zielesny, Achim Steinbeck, Christoph |
author_sort | Truszkowski, Andreas |
collection | PubMed |
description | BACKGROUND: The computational processing and analysis of small molecules is at heart of cheminformatics and structural bioinformatics and their application in e.g. metabolomics or drug discovery. Pipelining or workflow tools allow for the Lego™-like, graphical assembly of I/O modules and algorithms into a complex workflow which can be easily deployed, modified and tested without the hassle of implementing it into a monolithic application. The CDK-Taverna project aims at building a free open-source cheminformatics pipelining solution through combination of different open-source projects such as Taverna, the Chemistry Development Kit (CDK) or the Waikato Environment for Knowledge Analysis (WEKA). A first integrated version 1.0 of CDK-Taverna was recently released to the public. RESULTS: The CDK-Taverna project was migrated to the most up-to-date versions of its foundational software libraries with a complete re-engineering of its worker's architecture (version 2.0). 64-bit computing and multi-core usage by paralleled threads are now supported to allow for fast in-memory processing and analysis of large sets of molecules. Earlier deficiencies like workarounds for iterative data reading are removed. The combinatorial chemistry related reaction enumeration features are considerably enhanced. Additional functionality for calculating a natural product likeness score for small molecules is implemented to identify possible drug candidates. Finally the data analysis capabilities are extended with new workers that provide access to the open-source WEKA library for clustering and machine learning as well as training and test set partitioning. The new features are outlined with usage scenarios. CONCLUSIONS: CDK-Taverna 2.0 as an open-source cheminformatics workflow solution matured to become a freely available and increasingly powerful tool for the biosciences. The combination of the new CDK-Taverna worker family with the already available workflows developed by a lively Taverna community and published on myexperiment.org enables molecular scientists to quickly calculate, process and analyse molecular data as typically found in e.g. today's systems biology scenarios. |
format | Online Article Text |
id | pubmed-3292505 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3292505 2012-03-05 New developments on the cheminformatics open workflow environment CDK-Taverna Truszkowski, Andreas Jayaseelan, Kalai Vanii Neumann, Stefan Willighagen, Egon L Zielesny, Achim Steinbeck, Christoph J Cheminform Software BioMed Central 2011-12-13 /pmc/articles/PMC3292505/ /pubmed/22166170 http://dx.doi.org/10.1186/1758-2946-3-54 Text en Copyright ©2011 Truszkowski et al; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Truszkowski, Andreas Jayaseelan, Kalai Vanii Neumann, Stefan Willighagen, Egon L Zielesny, Achim Steinbeck, Christoph New developments on the cheminformatics open workflow environment CDK-Taverna |
title | New developments on the cheminformatics open workflow environment CDK-Taverna |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3292505/ https://www.ncbi.nlm.nih.gov/pubmed/22166170 http://dx.doi.org/10.1186/1758-2946-3-54 |