
Error correction of high-throughput sequencing datasets with non-uniform coverage

Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are...


Bibliographic Details
Main Authors: Medvedev, Paul; Scott, Eric; Kakaradov, Boyko; Pevzner, Pavel
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117386/
https://www.ncbi.nlm.nih.gov/pubmed/21685062
http://dx.doi.org/10.1093/bioinformatics/btr208
_version_ 1782206327275126784
author Medvedev, Paul
Scott, Eric
Kakaradov, Boyko
Pevzner, Pavel
author_facet Medvedev, Paul
Scott, Eric
Kakaradov, Boyko
Pevzner, Pavel
author_sort Medvedev, Paul
collection PubMed
description Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are sampled close to uniformly, the problem of correcting reads coming from drastically non-uniform datasets, such as those from single-cell sequencing, remains open. Results: In this article, we develop the method Hammer for error correction without any uniformity assumptions. Hammer is based on a combination of a Hamming graph and a simple probabilistic model for sequencing errors. It is a simple and adaptable algorithm that improves on other tools on non-uniform single-cell data, while achieving comparable results on normal multi-cell data. Availability: http://www.cs.toronto.edu/~pashadag. Contact: pmedvedev@cs.ucsd.edu
format Online
Article
Text
id pubmed-3117386
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-31173862011-06-17 Error correction of high-throughput sequencing datasets with non-uniform coverage Medvedev, Paul Scott, Eric Kakaradov, Boyko Pevzner, Pavel Bioinformatics Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are sampled close to uniformly, the problem of correcting reads coming from drastically non-uniform datasets, such as those from single-cell sequencing, remains open. Results: In this article, we develop the method Hammer for error correction without any uniformity assumptions. Hammer is based on a combination of a Hamming graph and a simple probabilistic model for sequencing errors. It is a simple and adaptable algorithm that improves on other tools on non-uniform single-cell data, while achieving comparable results on normal multi-cell data. Availability: http://www.cs.toronto.edu/~pashadag. Contact: pmedvedev@cs.ucsd.edu Oxford University Press 2011-07-01 2011-06-14 /pmc/articles/PMC3117386/ /pubmed/21685062 http://dx.doi.org/10.1093/bioinformatics/btr208 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria
Medvedev, Paul
Scott, Eric
Kakaradov, Boyko
Pevzner, Pavel
Error correction of high-throughput sequencing datasets with non-uniform coverage
title Error correction of high-throughput sequencing datasets with non-uniform coverage
title_full Error correction of high-throughput sequencing datasets with non-uniform coverage
title_fullStr Error correction of high-throughput sequencing datasets with non-uniform coverage
title_full_unstemmed Error correction of high-throughput sequencing datasets with non-uniform coverage
title_short Error correction of high-throughput sequencing datasets with non-uniform coverage
title_sort error correction of high-throughput sequencing datasets with non-uniform coverage
topic Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117386/
https://www.ncbi.nlm.nih.gov/pubmed/21685062
http://dx.doi.org/10.1093/bioinformatics/btr208
work_keys_str_mv AT medvedevpaul errorcorrectionofhighthroughputsequencingdatasetswithnonuniformcoverage
AT scotteric errorcorrectionofhighthroughputsequencingdatasetswithnonuniformcoverage
AT kakaradovboyko errorcorrectionofhighthroughputsequencingdatasetswithnonuniformcoverage
AT pevznerpavel errorcorrectionofhighthroughputsequencingdatasetswithnonuniformcoverage