Pollux: platform independent error correction of single and mixed genomes

BACKGROUND: Second-generation sequencers generate millions of relatively short, but error-prone, reads. These errors make sequence assembly and other downstream projects more challenging. Correcting these errors improves the quality of assemblies and projects which benefit from error-free reads. RES...

Bibliographic Details
Main authors: Marinier, Eric, Brown, Daniel G, McConkey, Brendan J
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4307147/
https://www.ncbi.nlm.nih.gov/pubmed/25592313
http://dx.doi.org/10.1186/s12859-014-0435-6
author Marinier, Eric
Brown, Daniel G
McConkey, Brendan J
collection PubMed
description BACKGROUND: Second-generation sequencers generate millions of relatively short, but error-prone, reads. These errors make sequence assembly and other downstream projects more challenging. Correcting these errors improves the quality of assemblies and projects which benefit from error-free reads. RESULTS: We have developed a general-purpose error corrector that corrects errors introduced by Illumina, Ion Torrent, and Roche 454 sequencing technologies and can be applied to single- or mixed-genome data. In addition to correcting substitution errors, we locate and correct insertion, deletion, and homopolymer errors while remaining sensitive to low coverage areas of sequencing projects. Using published data sets, we correct 94% of Illumina MiSeq errors, 88% of Ion Torrent PGM errors, and 85% of Roche 454 GS Junior errors. Introduced errors are 20 to 70 times rarer than successfully corrected errors. Furthermore, we show that the quality of assemblies improves when reads are corrected by our software. CONCLUSIONS: Pollux is highly effective at correcting errors across platforms, and consistently performs as well as or better than currently available error correction software. Pollux provides general-purpose error correction and may be used in applications with or without assembly.
format Online
Article
Text
id pubmed-4307147
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4307147 2015-01-28
Pollux: platform independent error correction of single and mixed genomes
Marinier, Eric
Brown, Daniel G
McConkey, Brendan J
BMC Bioinformatics
Software
BioMed Central 2015-01-16
/pmc/articles/PMC4307147/
/pubmed/25592313
http://dx.doi.org/10.1186/s12859-014-0435-6
Text en
© Marinier et al.; licensee BioMed Central. 2015
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Pollux: platform independent error correction of single and mixed genomes
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4307147/
https://www.ncbi.nlm.nih.gov/pubmed/25592313
http://dx.doi.org/10.1186/s12859-014-0435-6