Fiona: a parallel and automatic strategy for read error correction

Motivation: Automatic error correction of high-throughput sequencing data can have a dramatic impact on the amount of usable base pairs and their quality. It has been shown that the performance of tasks such as de novo genome assembly and SNP calling can be dramatically improved after read error cor...

Bibliographic Details
Main authors: Schulz, Marcel H., Weese, David, Holtgrewe, Manuel, Dimitrova, Viktoria, Niu, Sijia, Reinert, Knut, Richard, Hugues
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147893/
https://www.ncbi.nlm.nih.gov/pubmed/25161220
http://dx.doi.org/10.1093/bioinformatics/btu440
author Schulz, Marcel H.
Weese, David
Holtgrewe, Manuel
Dimitrova, Viktoria
Niu, Sijia
Reinert, Knut
Richard, Hugues
collection PubMed
description Motivation: Automatic error correction of high-throughput sequencing data can have a dramatic impact on the amount of usable base pairs and their quality. It has been shown that the performance of tasks such as de novo genome assembly and SNP calling can be dramatically improved after read error correction. While a large number of methods specialized for correcting substitution errors as found in Illumina data exist, few methods for the correction of indel errors, common to technologies like 454 or Ion Torrent, have been proposed. Results: We present Fiona, a new stand-alone read error–correction method. Fiona provides a new statistical approach for sequencing error detection and optimal error correction and estimates its parameters automatically. Fiona is able to correct substitution, insertion and deletion errors and can be applied to any sequencing technology. It uses an efficient implementation of the partial suffix array to detect read overlaps with different seed lengths in parallel. We tested Fiona on several real datasets from a variety of organisms with different read lengths and compared its performance with state-of-the-art methods. Fiona shows a constantly higher correction accuracy over a broad range of datasets from 454 and Ion Torrent sequencers, without compromise in speed. Conclusion: Fiona is an accurate parameter-free read error–correction method that can be run on inexpensive hardware and can make use of multicore parallelization whenever available. Fiona was implemented using the SeqAn library for sequence analysis and is publicly available for download at http://www.seqan.de/projects/fiona. Contact: mschulz@mmci.uni-saarland.de or hugues.richard@upmc.fr Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4147893
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4147893 2014-09-02 Fiona: a parallel and automatic strategy for read error correction Schulz, Marcel H. Weese, David Holtgrewe, Manuel Dimitrova, Viktoria Niu, Sijia Reinert, Knut Richard, Hugues Bioinformatics ECCB 2014 Proceedings Papers Committee Oxford University Press 2014-09-01 2014-08-22 /pmc/articles/PMC4147893/ /pubmed/25161220 http://dx.doi.org/10.1093/bioinformatics/btu440 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Fiona: a parallel and automatic strategy for read error correction
topic ECCB 2014 Proceedings Papers Committee
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147893/
https://www.ncbi.nlm.nih.gov/pubmed/25161220
http://dx.doi.org/10.1093/bioinformatics/btu440