cl-dash: rapid configuration and deployment of Hadoop clusters for bioinformatics research in the cloud
Summary: One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinforma...
Main Authors: | Hodor, Paul; Chawla, Amandeep; Clark, Andrew; Neal, Lauren |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2016 |
Subjects: | Applications Notes |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4708102/ https://www.ncbi.nlm.nih.gov/pubmed/26428290 http://dx.doi.org/10.1093/bioinformatics/btv553 |
_version_ | 1782409399000629248 |
---|---|
author | Hodor, Paul Chawla, Amandeep Clark, Andrew Neal, Lauren |
author_sort | Hodor, Paul |
collection | PubMed |
description | Summary: One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinformatics methodology using Hadoop. Here, we present cl-dash, a complete starter kit for setting up such an environment. Configuring and deploying new Hadoop clusters can be done in minutes. Use of Amazon Web Services ensures no initial investment and minimal operation costs. Two sample bioinformatics applications help the researcher understand and learn the principles of implementing an algorithm using the MapReduce programming pattern. Availability and implementation: Source code is available at https://bitbucket.org/booz-allen-sci-comp-team/cl-dash.git. Contact: hodor_paul@bah.com |
format | Online Article Text |
id | pubmed-4708102 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4708102 2016-01-12 cl-dash: rapid configuration and deployment of Hadoop clusters for bioinformatics research in the cloud Hodor, Paul Chawla, Amandeep Clark, Andrew Neal, Lauren Bioinformatics Applications Notes Summary: One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinformatics methodology using Hadoop. Here, we present cl-dash, a complete starter kit for setting up such an environment. Configuring and deploying new Hadoop clusters can be done in minutes. Use of Amazon Web Services ensures no initial investment and minimal operation costs. Two sample bioinformatics applications help the researcher understand and learn the principles of implementing an algorithm using the MapReduce programming pattern. Availability and implementation: Source code is available at https://bitbucket.org/booz-allen-sci-comp-team/cl-dash.git. Contact: hodor_paul@bah.com Oxford University Press 2016-01-15 2015-10-01 /pmc/articles/PMC4708102/ /pubmed/26428290 http://dx.doi.org/10.1093/bioinformatics/btv553 Text en © The Author 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | cl-dash: rapid configuration and deployment of Hadoop clusters for bioinformatics research in the cloud |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4708102/ https://www.ncbi.nlm.nih.gov/pubmed/26428290 http://dx.doi.org/10.1093/bioinformatics/btv553 |