ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing

BACKGROUND: Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in any given genome-wide association study (GWAS), performing multiple comparisons can be problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms were developed to correct for...

Bibliographic Details
Main authors: Misawa, Kazuharu; Kamatani, Naoyuki
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774321/
https://www.ncbi.nlm.nih.gov/pubmed/19845960
http://dx.doi.org/10.1186/1751-0473-4-7
_version_ 1782173929161359360
author Misawa, Kazuharu
Kamatani, Naoyuki
author_facet Misawa, Kazuharu
Kamatani, Naoyuki
author_sort Misawa, Kazuharu
collection PubMed
description BACKGROUND: Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in any given genome-wide association study (GWAS), performing multiple comparisons can be problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium. A permutation test can also control problems inherent in multiple testing; however, both the calculation of exact probability and the execution of permutation tests are time-consuming. Faster methods for calculating exact probabilities and executing permutation tests are required. METHODS: We developed a set of computer programs for the parallel computation of accurate P-values in haplotype-based GWAS. Our program, ParaHaplo, is intended for workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test on JPT and CHB of HapMap. RESULTS: ParaHaplo can detect smaller differences between 2 populations than SNP-based GWAS. We also found that parallel-computing techniques made ParaHaplo 100-fold faster than a non-parallel version of the program. CONCLUSION: ParaHaplo is a useful tool in conducting haplotype-based GWAS. Since the data sizes of such projects continue to increase, the use of fast computations with parallel computing--such as that used in ParaHaplo--will become increasingly important. The executable binaries and program sources of ParaHaplo are available at the following address:
format Text
id pubmed-2774321
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-27743212009-11-07 ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing Misawa, Kazuharu Kamatani, Naoyuki Source Code Biol Med Software Review BACKGROUND: Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in any given genome-wide association study (GWAS), performing multiple comparisons can be problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium. A permutation test can also control problems inherent in multiple testing; however, both the calculation of exact probability and the execution of permutation tests are time-consuming. Faster methods for calculating exact probabilities and executing permutation tests are required. METHODS: We developed a set of computer programs for the parallel computation of accurate P-values in haplotype-based GWAS. Our program, ParaHaplo, is intended for workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test on JPT and CHB of HapMap. RESULTS: ParaHaplo can detect smaller differences between 2 populations than SNP-based GWAS. We also found that parallel-computing techniques made ParaHaplo 100-fold faster than a non-parallel version of the program. CONCLUSION: ParaHaplo is a useful tool in conducting haplotype-based GWAS. Since the data sizes of such projects continue to increase, the use of fast computations with parallel computing--such as that used in ParaHaplo--will become increasingly important. The executable binaries and program sources of ParaHaplo are available at the following address: BioMed Central 2009-10-21 /pmc/articles/PMC2774321/ /pubmed/19845960 http://dx.doi.org/10.1186/1751-0473-4-7 Text en Copyright © 2009 Misawa and Kamatani; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Review
Misawa, Kazuharu
Kamatani, Naoyuki
ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing
title ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing
title_full ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing
title_fullStr ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing
title_full_unstemmed ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing
title_short ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing
title_sort parahaplo: a program package for haplotype-based whole-genome association study using parallel computing
topic Software Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774321/
https://www.ncbi.nlm.nih.gov/pubmed/19845960
http://dx.doi.org/10.1186/1751-0473-4-7
work_keys_str_mv AT misawakazuharu parahaploaprogrampackageforhaplotypebasedwholegenomeassociationstudyusingparallelcomputing
AT kamataninaoyuki parahaploaprogrampackageforhaplotypebasedwholegenomeassociationstudyusingparallelcomputing