Scalable and cost-effective NGS genotyping in the cloud

BACKGROUND: While next-generation sequencing (NGS) costs have plummeted in recent years, cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data...

Bibliographic Details
Main authors: Souilmi, Yassine; Lancaster, Alex K.; Jung, Jae-Yoon; Rizzo, Ettore; Hawkins, Jared B.; Powles, Ryan; Amzazi, Saaïd; Ghazal, Hassan; Tonellato, Peter J.; Wall, Dennis P.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4608296/
https://www.ncbi.nlm.nih.gov/pubmed/26470712
http://dx.doi.org/10.1186/s12920-015-0134-9
author Souilmi, Yassine
Lancaster, Alex K.
Jung, Jae-Yoon
Rizzo, Ettore
Hawkins, Jared B.
Powles, Ryan
Amzazi, Saaïd
Ghazal, Hassan
Tonellato, Peter J.
Wall, Dennis P.
author_sort Souilmi, Yassine
collection PubMed
description BACKGROUND: While next-generation sequencing (NGS) costs have plummeted in recent years, cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the tens of dollars. RESULTS: We take a step towards addressing this challenge by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows, making optimal use of high-performance compute clusters. Here we show that the Amazon Web Services (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. CONCLUSIONS: Our systematic benchmarking reveals important new insights and considerations for achieving clinical turn-around of whole genome analysis through optimization and workflow management, including strategic batching of individual genomes and efficient cluster resource configuration. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12920-015-0134-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4608296
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4608296 2015-10-17 Scalable and cost-effective NGS genotyping in the cloud BMC Med Genomics Technical Advance BioMed Central 2015-10-15 /pmc/articles/PMC4608296/ /pubmed/26470712 http://dx.doi.org/10.1186/s12920-015-0134-9 Text en © Souilmi et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Scalable and cost-effective NGS genotyping in the cloud
topic Technical Advance