Evaluation of haplotype-aware long-read error correction with hifieval
SUMMARY: The PacBio High-Fidelity (HiFi) sequencing technology produces long reads of >99% in accuracy. It has enabled the development of a new generation of de novo sequence assemblers, which all have sequencing error correction (EC) as the first step. As HiFi is a new data type,...
Main authors: | Guo, Yujie; Feng, Xiaowen; Li, Heng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10612404/ https://www.ncbi.nlm.nih.gov/pubmed/37851384 http://dx.doi.org/10.1093/bioinformatics/btad631 |
_version_ | 1785128697016090624 |
---|---|
author | Guo, Yujie Feng, Xiaowen Li, Heng |
collection | PubMed |
description | SUMMARY: The PacBio High-Fidelity (HiFi) sequencing technology produces long reads of >99% in accuracy. It has enabled the development of a new generation of de novo sequence assemblers, which all have sequencing error correction (EC) as the first step. As HiFi is a new data type, this critical step has not been evaluated before. Here, we introduced hifieval, a new command-line tool for measuring over- and under-corrections produced by EC algorithms. We assessed the accuracy of the EC components of existing HiFi assemblers on the CHM13 and the HG002 datasets and further investigated the performance of EC methods in challenging regions such as homopolymer regions, centromeric regions, and segmental duplications. Hifieval will help HiFi assemblers to improve EC and assembly quality in the long run. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/magspho/hifieval. |
format | Online Article Text |
id | pubmed-10612404 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10612404 2023-10-29 Evaluation of haplotype-aware long-read error correction with hifieval. Guo, Yujie; Feng, Xiaowen; Li, Heng. Bioinformatics, Applications Note. Oxford University Press, 2023-10-18. Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Evaluation of haplotype-aware long-read error correction with hifieval |
topic | Applications Note |