PhaseME: Automatic rapid assessment of phasing quality and phasing improvement
BACKGROUND: The detection of which mutations are occurring on the same DNA molecule is essential to predict their consequences. This can be achieved by phasing the genomic variations. Nevertheless, state-of-the-art haplotype phasing is currently a black box in which the accuracy and quality of the r...
Main authors: | Majidian, Sina; Sedlazeck, Fritz J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Technical Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7379178/ https://www.ncbi.nlm.nih.gov/pubmed/32706368 http://dx.doi.org/10.1093/gigascience/giaa078 |
_version_ | 1783562581809561600 |
---|---|
author | Majidian, Sina Sedlazeck, Fritz J |
author_facet | Majidian, Sina Sedlazeck, Fritz J |
author_sort | Majidian, Sina |
collection | PubMed |
description | BACKGROUND: The detection of which mutations are occurring on the same DNA molecule is essential to predict their consequences. This can be achieved by phasing the genomic variations. Nevertheless, state-of-the-art haplotype phasing is currently a black box in which the accuracy and quality of the reconstructed haplotypes are hard to assess. FINDINGS: Here we present PhaseME, a versatile method to provide insights into and improvement of sample phasing results based on linkage data. We showcase the performance and the importance of PhaseME by comparing phasing information obtained from Pacific Biosciences including both continuous long reads and high-quality consensus reads, Oxford Nanopore Technologies, 10x Genomics, and Illumina sequencing technologies. We found that 10x Genomics and Oxford Nanopore phasing can be significantly improved while retaining a high N50 and completeness of phase blocks. PhaseME generates reports and summary plots to provide insights into phasing performance and correctness. We observed unique phasing issues for each of the sequencing technologies, highlighting the necessity of quality assessments. PhaseME is able to decrease the Hamming error rate significantly by 22.4% on average across all 5 technologies. Additionally, a significant improvement is obtained in the reduction of long switch errors. Especially for high-quality consensus reads, the improvement is 54.6% in return for only a 5% decrease in phase block N50 length. CONCLUSIONS: PhaseME is a universal method to assess the phasing quality and accuracy and improves the quality of phasing using linkage information. The package is freely available at https://github.com/smajidian/phaseme. |
format | Online Article Text |
id | pubmed-7379178 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7379178 2020-07-29 PhaseME: Automatic rapid assessment of phasing quality and phasing improvement Majidian, Sina Sedlazeck, Fritz J Gigascience Technical Note Oxford University Press 2020-07-24 /pmc/articles/PMC7379178/ /pubmed/32706368 http://dx.doi.org/10.1093/gigascience/giaa078 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Majidian, Sina Sedlazeck, Fritz J PhaseME: Automatic rapid assessment of phasing quality and phasing improvement |
title | PhaseME: Automatic rapid assessment of phasing quality and phasing improvement |
title_full | PhaseME: Automatic rapid assessment of phasing quality and phasing improvement |
title_fullStr | PhaseME: Automatic rapid assessment of phasing quality and phasing improvement |
title_full_unstemmed | PhaseME: Automatic rapid assessment of phasing quality and phasing improvement |
title_short | PhaseME: Automatic rapid assessment of phasing quality and phasing improvement |
title_sort | phaseme: automatic rapid assessment of phasing quality and phasing improvement |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7379178/ https://www.ncbi.nlm.nih.gov/pubmed/32706368 http://dx.doi.org/10.1093/gigascience/giaa078 |
work_keys_str_mv | AT majidiansina phasemeautomaticrapidassessmentofphasingqualityandphasingimprovement AT sedlazeckfritzj phasemeautomaticrapidassessmentofphasingqualityandphasingimprovement |
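
The description above reports results in terms of phase block N50 and Hamming error rate without defining them. As a rough illustration only, the sketch below computes both under common textbook definitions; it is not taken from the PhaseME package. The function names `phase_block_n50` and `hamming_error_rate` are hypothetical, and the haplotype encoding (one character per heterozygous biallelic site, a single phase block) is an assumption.

```python
def phase_block_n50(block_lengths):
    """Return the N50 of a collection of phase block lengths.

    Assumed definition: the largest length L such that blocks of
    length >= L together cover at least half of the total length.
    """
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no blocks given


def hamming_error_rate(predicted, truth):
    """Return the Hamming error rate of one predicted haplotype.

    Assumption: both haplotypes are strings of '0'/'1' over the same
    heterozygous sites within one phase block. Because the two
    haplotypes of a diploid sample are complementary at such sites,
    the predicted string is compared against the truth and against
    its complement, and the smaller mismatch count is used.
    """
    if len(predicted) != len(truth):
        raise ValueError("haplotypes must cover the same sites")
    if not truth:
        return 0.0
    mismatches = sum(p != t for p, t in zip(predicted, truth))
    return min(mismatches, len(truth) - mismatches) / len(truth)


if __name__ == "__main__":
    print(phase_block_n50([10, 20, 30, 40]))
    print(hamming_error_rate("0101", "0111"))
```

Taking the minimum over the two orientations means a haplotype that is phased perfectly but labelled as the opposite parent scores zero error, which is why Hamming error is usually reported per phase block.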