A comparative evaluation of hybrid error correction methods for error-prone long reads
BACKGROUND: Third-generation sequencing technologies have advanced the progress of the biological research by generating reads that are substantially longer than second-generation sequencing technologies. However, their notorious high error rate impedes straightforward data analysis and limits their...
Main authors: | Fu, Shuhua; Wang, Anqi; Au, Kin Fai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6362602/ https://www.ncbi.nlm.nih.gov/pubmed/30717772 http://dx.doi.org/10.1186/s13059-018-1605-z |
_version_ | 1783392954482688000 |
---|---|
author | Fu, Shuhua Wang, Anqi Au, Kin Fai |
author_facet | Fu, Shuhua Wang, Anqi Au, Kin Fai |
author_sort | Fu, Shuhua |
collection | PubMed |
description | BACKGROUND: Third-generation sequencing technologies have advanced the progress of the biological research by generating reads that are substantially longer than second-generation sequencing technologies. However, their notorious high error rate impedes straightforward data analysis and limits their application. A handful of error correction methods for these error-prone long reads have been developed to date. The output data quality is very important for downstream analysis, whereas computing resources could limit the utility of some computing-intense tools. There is a lack of standardized assessments for these long-read error-correction methods. RESULTS: Here, we present a comparative performance assessment of ten state-of-the-art error-correction methods for long reads. We established a common set of benchmarks for performance assessment, including sensitivity, accuracy, output rate, alignment rate, output read length, run time, and memory usage, as well as the effects of error correction on two downstream applications of long reads: de novo assembly and resolving haplotype sequences. CONCLUSIONS: Taking into account all of these metrics, we provide a suggestive guideline for method choice based on available data size, computing resources, and individual research goals. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1605-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6362602 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-63626022019-02-14 A comparative evaluation of hybrid error correction methods for error-prone long reads Fu, Shuhua Wang, Anqi Au, Kin Fai Genome Biol Research BACKGROUND: Third-generation sequencing technologies have advanced the progress of the biological research by generating reads that are substantially longer than second-generation sequencing technologies. However, their notorious high error rate impedes straightforward data analysis and limits their application. A handful of error correction methods for these error-prone long reads have been developed to date. The output data quality is very important for downstream analysis, whereas computing resources could limit the utility of some computing-intense tools. There is a lack of standardized assessments for these long-read error-correction methods. RESULTS: Here, we present a comparative performance assessment of ten state-of-the-art error-correction methods for long reads. We established a common set of benchmarks for performance assessment, including sensitivity, accuracy, output rate, alignment rate, output read length, run time, and memory usage, as well as the effects of error correction on two downstream applications of long reads: de novo assembly and resolving haplotype sequences. CONCLUSIONS: Taking into account all of these metrics, we provide a suggestive guideline for method choice based on available data size, computing resources, and individual research goals. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1605-z) contains supplementary material, which is available to authorized users. BioMed Central 2019-02-04 /pmc/articles/PMC6362602/ /pubmed/30717772 http://dx.doi.org/10.1186/s13059-018-1605-z Text en © The Author(s). 
2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Fu, Shuhua Wang, Anqi Au, Kin Fai A comparative evaluation of hybrid error correction methods for error-prone long reads |
title | A comparative evaluation of hybrid error correction methods for error-prone long reads |
title_sort | comparative evaluation of hybrid error correction methods for error-prone long reads |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6362602/ https://www.ncbi.nlm.nih.gov/pubmed/30717772 http://dx.doi.org/10.1186/s13059-018-1605-z |