Error correction of high-throughput sequencing datasets with non-uniform coverage
Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are...
Main authors: | Medvedev, Paul; Scott, Eric; Kakaradov, Boyko; Pevzner, Pavel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117386/ https://www.ncbi.nlm.nih.gov/pubmed/21685062 http://dx.doi.org/10.1093/bioinformatics/btr208 |
_version_ | 1782206327275126784 |
---|---|
author | Medvedev, Paul; Scott, Eric; Kakaradov, Boyko; Pevzner, Pavel |
author_facet | Medvedev, Paul; Scott, Eric; Kakaradov, Boyko; Pevzner, Pavel |
author_sort | Medvedev, Paul |
collection | PubMed |
description | Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are sampled close to uniformly, the problem of correcting reads coming from drastically non-uniform datasets, such as those from single-cell sequencing, remains open. Results: In this article, we develop the method Hammer for error correction without any uniformity assumptions. Hammer is based on a combination of a Hamming graph and a simple probabilistic model for sequencing errors. It is a simple and adaptable algorithm that improves on other tools on non-uniform single-cell data, while achieving comparable results on normal multi-cell data. Availability: http://www.cs.toronto.edu/~pashadag. Contact: pmedvedev@cs.ucsd.edu |
format | Online Article Text |
id | pubmed-3117386 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3117386 2011-06-17 Error correction of high-throughput sequencing datasets with non-uniform coverage Medvedev, Paul Scott, Eric Kakaradov, Boyko Pevzner, Pavel Bioinformatics Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are sampled close to uniformly, the problem of correcting reads coming from drastically non-uniform datasets, such as those from single-cell sequencing, remains open. Results: In this article, we develop the method Hammer for error correction without any uniformity assumptions. Hammer is based on a combination of a Hamming graph and a simple probabilistic model for sequencing errors. It is a simple and adaptable algorithm that improves on other tools on non-uniform single-cell data, while achieving comparable results on normal multi-cell data. Availability: http://www.cs.toronto.edu/~pashadag. Contact: pmedvedev@cs.ucsd.edu Oxford University Press 2011-07-01 2011-06-14 /pmc/articles/PMC3117386/ /pubmed/21685062 http://dx.doi.org/10.1093/bioinformatics/btr208 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria Medvedev, Paul Scott, Eric Kakaradov, Boyko Pevzner, Pavel Error correction of high-throughput sequencing datasets with non-uniform coverage |
title | Error correction of high-throughput sequencing datasets with non-uniform coverage |
topic | Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117386/ https://www.ncbi.nlm.nih.gov/pubmed/21685062 http://dx.doi.org/10.1093/bioinformatics/btr208 |