PhaseME: Automatic rapid assessment of phasing quality and phasing improvement

Bibliographic Details
Main Authors: Majidian, Sina; Sedlazeck, Fritz J
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Technical Note
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7379178/
https://www.ncbi.nlm.nih.gov/pubmed/32706368
http://dx.doi.org/10.1093/gigascience/giaa078
author Majidian, Sina
Sedlazeck, Fritz J
collection PubMed
description BACKGROUND: The detection of which mutations are occurring on the same DNA molecule is essential to predict their consequences. This can be achieved by phasing the genomic variations. Nevertheless, state-of-the-art haplotype phasing is currently a black box in which the accuracy and quality of the reconstructed haplotypes are hard to assess. FINDINGS: Here we present PhaseME, a versatile method to provide insights into and improvement of sample phasing results based on linkage data. We showcase the performance and the importance of PhaseME by comparing phasing information obtained from Pacific Biosciences including both continuous long reads and high-quality consensus reads, Oxford Nanopore Technologies, 10x Genomics, and Illumina sequencing technologies. We found that 10x Genomics and Oxford Nanopore phasing can be significantly improved while retaining a high N50 and completeness of phase blocks. PhaseME generates reports and summary plots to provide insights into phasing performance and correctness. We observed unique phasing issues for each of the sequencing technologies, highlighting the necessity of quality assessments. PhaseME is able to decrease the Hamming error rate significantly by 22.4% on average across all 5 technologies. Additionally, a significant improvement is obtained in the reduction of long switch errors. Especially for high-quality consensus reads, the improvement is 54.6% in return for only a 5% decrease in phase block N50 length. CONCLUSIONS: PhaseME is a universal method to assess the phasing quality and accuracy and improves the quality of phasing using linkage information. The package is freely available at https://github.com/smajidian/phaseme.
format Online
Article
Text
id pubmed-7379178
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7379178 2020-07-29 PhaseME: Automatic rapid assessment of phasing quality and phasing improvement. Majidian, Sina; Sedlazeck, Fritz J. Gigascience. Technical Note. Oxford University Press 2020-07-24. Text en. © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title PhaseME: Automatic rapid assessment of phasing quality and phasing improvement
topic Technical Note