Initial steps towards a production platform for DNA sequence analysis on the grid


Bibliographic Details
Main authors: Luyf, Angela CM, van Schaik, Barbera DC, de Vries, Michel, Baas, Frank, van Kampen, Antoine HC, Olabarriaga, Silvia D
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3018473/
https://www.ncbi.nlm.nih.gov/pubmed/21156038
http://dx.doi.org/10.1186/1471-2105-11-598
_version_ 1782196075796365312
author Luyf, Angela CM
van Schaik, Barbera DC
de Vries, Michel
Baas, Frank
van Kampen, Antoine HC
Olabarriaga, Silvia D
collection PubMed
description BACKGROUND: Bioinformatics is confronted with a new data explosion due to the availability of high-throughput DNA sequencers. Data storage and analysis become a problem on local servers, and it is therefore necessary to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. RESULTS: In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. CONCLUSIONS: The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next-generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/
format Text
id pubmed-3018473
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30184732011-01-11 Initial steps towards a production platform for DNA sequence analysis on the grid Luyf, Angela CM van Schaik, Barbera DC de Vries, Michel Baas, Frank van Kampen, Antoine HC Olabarriaga, Silvia D BMC Bioinformatics Research Article BACKGROUND: Bioinformatics is confronted with a new data explosion due to the availability of high-throughput DNA sequencers. Data storage and analysis become a problem on local servers, and it is therefore necessary to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. RESULTS: In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. CONCLUSIONS: The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next-generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/ BioMed Central 2010-12-14 /pmc/articles/PMC3018473/ /pubmed/21156038 http://dx.doi.org/10.1186/1471-2105-11-598 Text en Copyright ©2010 Luyf et al; licensee BioMed Central Ltd. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Initial steps towards a production platform for DNA sequence analysis on the grid
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3018473/
https://www.ncbi.nlm.nih.gov/pubmed/21156038
http://dx.doi.org/10.1186/1471-2105-11-598