Exploring effective approaches for haplotype block phasing
BACKGROUND: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While...
Main authors: | Al Bkhetan, Ziad; Zobel, Justin; Kowalczyk, Adam; Verspoor, Karin; Goudey, Benjamin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822470/ https://www.ncbi.nlm.nih.gov/pubmed/31666002 http://dx.doi.org/10.1186/s12859-019-3095-8 |
_version_ | 1783464344068030464 |
---|---|
author | Al Bkhetan, Ziad Zobel, Justin Kowalczyk, Adam Verspoor, Karin Goudey, Benjamin |
author_sort | Al Bkhetan, Ziad |
collection | PubMed |
description | BACKGROUND: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While the accuracy of methods for phasing genotype data has been widely explored, there has been little attention given to phasing accuracy at haplotype block scale. Understanding the combined impact of the accuracy of phasing tool and the method used to determine haplotype blocks on the error rate within the determined blocks is essential to conduct accurate haplotype analyses. RESULTS: We present a systematic study exploring the relationship between seven widely used phasing methods and two common methods for determining haplotype blocks. The evaluation focuses on the number of haplotype blocks that are incorrectly phased. Insights from these results are used to develop a haplotype estimator based on a consensus of three tools. The consensus estimator achieved the most accurate phasing in all applied tests. Individually, EAGLE2, BEAGLE and SHAPEIT2 alternate in being the best performing tool in different scenarios. Determining haplotype blocks based on linkage disequilibrium leads to more correctly phased blocks compared to a sliding window approach. We find that there is little difference between phasing sections of a genome (e.g. a gene) compared to phasing entire chromosomes. Finally, we show that the location of phasing error varies when the tools are applied to the same data several times, a finding that could be important for downstream analyses. CONCLUSIONS: The choice of phasing and block determination algorithms and their interaction impacts the accuracy of phased haplotype blocks. This work provides guidance and evidence for the different design choices needed for analyses using haplotype blocks. The study highlights a number of issues that may have limited the replicability of previous haplotype analysis. |
format | Online Article Text |
id | pubmed-6822470 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6822470 2019-11-06 Exploring effective approaches for haplotype block phasing Al Bkhetan, Ziad Zobel, Justin Kowalczyk, Adam Verspoor, Karin Goudey, Benjamin BMC Bioinformatics Research Article BACKGROUND: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While the accuracy of methods for phasing genotype data has been widely explored, there has been little attention given to phasing accuracy at haplotype block scale. Understanding the combined impact of the accuracy of phasing tool and the method used to determine haplotype blocks on the error rate within the determined blocks is essential to conduct accurate haplotype analyses. RESULTS: We present a systematic study exploring the relationship between seven widely used phasing methods and two common methods for determining haplotype blocks. The evaluation focuses on the number of haplotype blocks that are incorrectly phased. Insights from these results are used to develop a haplotype estimator based on a consensus of three tools. The consensus estimator achieved the most accurate phasing in all applied tests. Individually, EAGLE2, BEAGLE and SHAPEIT2 alternate in being the best performing tool in different scenarios. Determining haplotype blocks based on linkage disequilibrium leads to more correctly phased blocks compared to a sliding window approach. We find that there is little difference between phasing sections of a genome (e.g. a gene) compared to phasing entire chromosomes. Finally, we show that the location of phasing error varies when the tools are applied to the same data several times, a finding that could be important for downstream analyses. CONCLUSIONS: The choice of phasing and block determination algorithms and their interaction impacts the accuracy of phased haplotype blocks. This work provides guidance and evidence for the different design choices needed for analyses using haplotype blocks. The study highlights a number of issues that may have limited the replicability of previous haplotype analysis. BioMed Central 2019-10-30 /pmc/articles/PMC6822470/ /pubmed/31666002 http://dx.doi.org/10.1186/s12859-019-3095-8 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Exploring effective approaches for haplotype block phasing |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822470/ https://www.ncbi.nlm.nih.gov/pubmed/31666002 http://dx.doi.org/10.1186/s12859-019-3095-8 |