Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud

BACKGROUND: Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomic...

Bibliographic Details
Main authors: Afgan, Enis, Sloggett, Clare, Goonasekera, Nuwan, Makunin, Igor, Benson, Derek, Crowe, Mark, Gladman, Simon, Kowsar, Yousef, Pheasant, Michael, Horst, Ron, Lonie, Andrew
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4621043/
https://www.ncbi.nlm.nih.gov/pubmed/26501966
http://dx.doi.org/10.1371/journal.pone.0140829
author Afgan, Enis
Sloggett, Clare
Goonasekera, Nuwan
Makunin, Igor
Benson, Derek
Crowe, Mark
Gladman, Simon
Kowsar, Yousef
Pheasant, Michael
Horst, Ron
Lonie, Andrew
author_sort Afgan, Enis
collection PubMed
description BACKGROUND: Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise.
RESULTS: We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic.
CONCLUSIONS: This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the value added to the research community through the suite of services and resources provided by our implementation.
format Online
Article
Text
id pubmed-4621043
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4621043 2015-10-29 PLoS One Research Article Public Library of Science 2015-10-26 /pmc/articles/PMC4621043/ /pubmed/26501966 http://dx.doi.org/10.1371/journal.pone.0140829 Text en © 2015 Afgan et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud
topic Research Article