Denoising PCR-amplified metagenome data

Bibliographic Details
Main Authors: Rosen, Michael J, Callahan, Benjamin J, Fisher, Daniel S, Holmes, Susan P
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3563472/
https://www.ncbi.nlm.nih.gov/pubmed/23113967
http://dx.doi.org/10.1186/1471-2105-13-283
author Rosen, Michael J
Callahan, Benjamin J
Fisher, Daniel S
Holmes, Susan P
collection PubMed
description BACKGROUND: PCR amplification and high-throughput sequencing theoretically enable the characterization of the finest-scale diversity in natural microbial and viral populations, but each of these methods introduces random errors that are difficult to distinguish from genuine biological diversity. Several approaches have been proposed to denoise these data but lack either speed or accuracy. RESULTS: We introduce a new denoising algorithm that we call DADA (Divisive Amplicon Denoising Algorithm). Without training data, DADA infers both the sample genotypes and error parameters that produced a metagenome data set. We demonstrate performance on control data sequenced on Roche’s 454 platform, and compare the results to the most accurate denoising software currently available, AmpliconNoise. CONCLUSIONS: DADA is more accurate and over an order of magnitude faster than AmpliconNoise. It eliminates the need for training data to establish error parameters, fully utilizes sequence-abundance information, and enables inclusion of context-dependent PCR error rates. It should be readily extensible to other sequencing platforms such as Illumina.
format Online
Article
Text
id pubmed-3563472
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3563472 2013-02-08 Denoising PCR-amplified metagenome data Rosen, Michael J Callahan, Benjamin J Fisher, Daniel S Holmes, Susan P BMC Bioinformatics Research Article BioMed Central 2012-10-31 /pmc/articles/PMC3563472/ /pubmed/23113967 http://dx.doi.org/10.1186/1471-2105-13-283 Text en Copyright ©2012 Rosen et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Denoising PCR-amplified metagenome data
topic Research Article