Exploring effective approaches for haplotype block phasing

BACKGROUND: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While...

Bibliographic Details
Main Authors: Al Bkhetan, Ziad, Zobel, Justin, Kowalczyk, Adam, Verspoor, Karin, Goudey, Benjamin
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822470/
https://www.ncbi.nlm.nih.gov/pubmed/31666002
http://dx.doi.org/10.1186/s12859-019-3095-8
_version_ 1783464344068030464
author Al Bkhetan, Ziad
Zobel, Justin
Kowalczyk, Adam
Verspoor, Karin
Goudey, Benjamin
author_facet Al Bkhetan, Ziad
Zobel, Justin
Kowalczyk, Adam
Verspoor, Karin
Goudey, Benjamin
author_sort Al Bkhetan, Ziad
collection PubMed
description BACKGROUND: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While the accuracy of methods for phasing genotype data has been widely explored, there has been little attention given to phasing accuracy at haplotype block scale. Understanding the combined impact of the accuracy of the phasing tool and the method used to determine haplotype blocks on the error rate within the determined blocks is essential for conducting accurate haplotype analyses. RESULTS: We present a systematic study exploring the relationship between seven widely used phasing methods and two common methods for determining haplotype blocks. The evaluation focuses on the number of haplotype blocks that are incorrectly phased. Insights from these results are used to develop a haplotype estimator based on a consensus of three tools. The consensus estimator achieved the most accurate phasing in all applied tests. Individually, EAGLE2, BEAGLE and SHAPEIT2 alternate in being the best performing tool in different scenarios. Determining haplotype blocks based on linkage disequilibrium leads to more correctly phased blocks than a sliding window approach. We find that there is little difference between phasing sections of a genome (e.g. a gene) and phasing entire chromosomes. Finally, we show that the location of phasing errors varies when the tools are applied to the same data several times, a finding that could be important for downstream analyses. CONCLUSIONS: The choice of phasing and block determination algorithms, and their interaction, impact the accuracy of phased haplotype blocks. This work provides guidance and evidence for the different design choices needed for analyses using haplotype blocks. The study highlights a number of issues that may have limited the replicability of previous haplotype analyses.
format Online
Article
Text
id pubmed-6822470
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-68224702019-11-06 Exploring effective approaches for haplotype block phasing Al Bkhetan, Ziad Zobel, Justin Kowalczyk, Adam Verspoor, Karin Goudey, Benjamin BMC Bioinformatics Research Article BACKGROUND: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While the accuracy of methods for phasing genotype data has been widely explored, there has been little attention given to phasing accuracy at haplotype block scale. Understanding the combined impact of the accuracy of phasing tool and the method used to determine haplotype blocks on the error rate within the determined blocks is essential to conduct accurate haplotype analyses. RESULTS: We present a systematic study exploring the relationship between seven widely used phasing methods and two common methods for determining haplotype blocks. The evaluation focuses on the number of haplotype blocks that are incorrectly phased. Insights from these results are used to develop a haplotype estimator based on a consensus of three tools. The consensus estimator achieved the most accurate phasing in all applied tests. Individually, EAGLE2, BEAGLE and SHAPEIT2 alternate in being the best performing tool in different scenarios. Determining haplotype blocks based on linkage disequilibrium leads to more correctly phased blocks compared to a sliding window approach. We find that there is little difference between phasing sections of a genome (e.g. a gene) compared to phasing entire chromosomes. Finally, we show that the location of phasing error varies when the tools are applied to the same data several times, a finding that could be important for downstream analyses. 
CONCLUSIONS: The choice of phasing and block determination algorithms and their interaction impacts the accuracy of phased haplotype blocks. This work provides guidance and evidence for the different design choices needed for analyses using haplotype blocks. The study highlights a number of issues that may have limited the replicability of previous haplotype analysis. BioMed Central 2019-10-30 /pmc/articles/PMC6822470/ /pubmed/31666002 http://dx.doi.org/10.1186/s12859-019-3095-8 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Al Bkhetan, Ziad
Zobel, Justin
Kowalczyk, Adam
Verspoor, Karin
Goudey, Benjamin
Exploring effective approaches for haplotype block phasing
title Exploring effective approaches for haplotype block phasing
title_full Exploring effective approaches for haplotype block phasing
title_fullStr Exploring effective approaches for haplotype block phasing
title_full_unstemmed Exploring effective approaches for haplotype block phasing
title_short Exploring effective approaches for haplotype block phasing
title_sort exploring effective approaches for haplotype block phasing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822470/
https://www.ncbi.nlm.nih.gov/pubmed/31666002
http://dx.doi.org/10.1186/s12859-019-3095-8
work_keys_str_mv AT albkhetanziad exploringeffectiveapproachesforhaplotypeblockphasing
AT zobeljustin exploringeffectiveapproachesforhaplotypeblockphasing
AT kowalczykadam exploringeffectiveapproachesforhaplotypeblockphasing
AT verspoorkarin exploringeffectiveapproachesforhaplotypeblockphasing
AT goudeybenjamin exploringeffectiveapproachesforhaplotypeblockphasing