MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants

BACKGROUND: Next-generation genome sequencing techniques have become affordable for massive sequencing efforts devoted to the clinical characterization of human diseases. However, the cost of cloud-based analysis of the mounting datasets remains a bottleneck for providing cost-effec...

Bibliographic Details
Main authors: Elshazly, Hatem; Souilmi, Yassine; Tonellato, Peter J.; Wall, Dennis P.; Abouelhoda, Mohamed
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5248509/
https://www.ncbi.nlm.nih.gov/pubmed/28107819
http://dx.doi.org/10.1186/s12859-016-1454-2
author Elshazly, Hatem
Souilmi, Yassine
Tonellato, Peter J.
Wall, Dennis P.
Abouelhoda, Mohamed
collection PubMed
description BACKGROUND: Next-generation genome sequencing techniques have become affordable for massive sequencing efforts devoted to the clinical characterization of human diseases. However, the cost of cloud-based analysis of the mounting datasets remains a bottleneck for providing cost-effective clinical services. To address this computational problem, it is important to optimize the variant analysis workflow and the analysis tools it uses, reducing the overall processing time and, with it, the processing cost. Furthermore, it is important to capitalize on recent developments in the cloud computing market, which has seen more providers competing on products and prices. RESULTS: In this paper, we present a new package called MC-GenomeKey (Multi-Cloud GenomeKey) that efficiently executes the variant analysis workflow for detecting and annotating mutations using cloud resources from different commercial cloud providers. Our package supports the Amazon, Google, and Azure clouds, as well as any other cloud platform based on OpenStack. It allows execution scenarios of different levels of sophistication, up to one in which a workflow is executed on a cluster whose nodes come from different clouds. MC-GenomeKey also supports scenarios that exploit Amazon's spot instance model in combination with other cloud platforms to provide significant cost reduction. To the best of our knowledge, this is the first solution that optimizes the execution of the workflow using computational resources from different cloud providers. CONCLUSIONS: MC-GenomeKey provides an efficient multicloud-based solution to detect and annotate mutations. The package can run on different commercial cloud platforms, which enables the user to seize the best offers. The package also provides a reliable means of using Amazon's low-cost spot instance model, as it provides an efficient solution to the sudden termination of spot machines caused by a sudden price increase. The package has a web interface and is available free for academic use.
format Online
Article
Text
id pubmed-5248509
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5248509 2017-01-25 MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants. Elshazly, Hatem; Souilmi, Yassine; Tonellato, Peter J.; Wall, Dennis P.; Abouelhoda, Mohamed. BMC Bioinformatics. Software. BioMed Central 2017-01-20 /pmc/articles/PMC5248509/ /pubmed/28107819 http://dx.doi.org/10.1186/s12859-016-1454-2 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants
topic Software