ParaHaplo 2.0: a program package for haplotype-estimation and haplotype-based whole-genome association study using parallel computing

BACKGROUND: The use of haplotype-based association tests can improve the power of genome-wide association studies. Since the observed genotypes are unordered pairs of alleles, haplotype phase must be inferred. However, estimating haplotype phase is time consuming. When millions of single-nucleotide...

Bibliographic Details
Main Authors:	Misawa, Kazuharu; Kamatani, Naoyuki
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Software review
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892495/ https://www.ncbi.nlm.nih.gov/pubmed/20525312 http://dx.doi.org/10.1186/1751-0473-5-5

_version_	1782182955792203776
author	Misawa, Kazuharu Kamatani, Naoyuki
author_facet	Misawa, Kazuharu Kamatani, Naoyuki
author_sort	Misawa, Kazuharu
collection	PubMed
description	BACKGROUND: The use of haplotype-based association tests can improve the power of genome-wide association studies. Since the observed genotypes are unordered pairs of alleles, haplotype phase must be inferred. However, estimating haplotype phase is time-consuming. When millions of single-nucleotide polymorphisms (SNPs) are analyzed in a genome-wide association study, faster methods for haplotype estimation are required. METHODS: We developed a program package for parallel computation of haplotype estimation. Our program package, ParaHaplo 2.0, is intended for use in workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test on the HapMap datasets for Japanese in Tokyo, Japan, and Han Chinese in Beijing, China. RESULTS: The parallel version of ParaHaplo 2.0 can estimate haplotypes 100 times faster than a non-parallel version of ParaHaplo. CONCLUSION: ParaHaplo 2.0 is an invaluable tool for conducting haplotype-based genome-wide association studies (GWAS). The need for fast haplotype estimation using parallel computing will become increasingly important as the data sizes of such projects continue to increase. The executable binaries and program sources of ParaHaplo are available at the following address: http://en.sourceforge.jp/projects/parallelgwas/releases/
format	Text
id	pubmed-2892495
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2892495 2010-06-26 Source Code Biol Med BioMed Central 2010-06-04 /pmc/articles/PMC2892495/ /pubmed/20525312 http://dx.doi.org/10.1186/1751-0473-5-5 Text en Copyright ©2010 Misawa and Kamatani; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	ParaHaplo 2.0: a program package for haplotype-estimation and haplotype-based whole-genome association study using parallel computing
topic	Software review
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892495/ https://www.ncbi.nlm.nih.gov/pubmed/20525312 http://dx.doi.org/10.1186/1751-0473-5-5
