
Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics

While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelizati...


Bibliographic Details
Main authors: Kelly, Benjamin J, Fitch, James R, Hu, Yangqiu, Corsmeier, Donald J, Zhong, Huachun, Wetzel, Amy N, Nordquist, Russell D, Newsom, David L, White, Peter
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333267/
https://www.ncbi.nlm.nih.gov/pubmed/25600152
http://dx.doi.org/10.1186/s13059-014-0577-x
author Kelly, Benjamin J
Fitch, James R
Hu, Yangqiu
Corsmeier, Donald J
Zhong, Huachun
Wetzel, Amy N
Nordquist, Russell D
Newsom, David L
White, Peter
author_sort Kelly, Benjamin J
collection PubMed
description While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0577-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4333267
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4333267 2015-02-20
Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics
Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter
Genome Biol
Method
While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0577-x) contains supplementary material, which is available to authorized users.
BioMed Central 2015-01-20 2015
/pmc/articles/PMC4333267/ /pubmed/25600152 http://dx.doi.org/10.1186/s13059-014-0577-x
Text en
© Kelly et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333267/
https://www.ncbi.nlm.nih.gov/pubmed/25600152
http://dx.doi.org/10.1186/s13059-014-0577-x