A Platform-Independent Method for Detecting Errors in Metagenomic Sequencing Data: DRISEE

We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as “noise” or “error”) within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within...

Bibliographic Details
Main authors: Keegan, Kevin P., Trimble, William L., Wilkening, Jared, Wilke, Andreas, Harrison, Travis, D'Souza, Mark, Meyer, Folker
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3369934/
https://www.ncbi.nlm.nih.gov/pubmed/22685393
http://dx.doi.org/10.1371/journal.pcbi.1002541
author Keegan, Kevin P.
Trimble, William L.
Wilkening, Jared
Wilke, Andreas
Harrison, Travis
D'Souza, Mark
Meyer, Folker
collection PubMed
description We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as “noise” or “error”) within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g. Phred). Here, DRISEE is applied to (non amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms.
format Online
Article
Text
id pubmed-3369934
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3369934 2012-06-08 PLoS Comput Biol Research Article Public Library of Science 2012-06-07 /pmc/articles/PMC3369934/ /pubmed/22685393 http://dx.doi.org/10.1371/journal.pcbi.1002541 Text en Keegan et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Platform-Independent Method for Detecting Errors in Metagenomic Sequencing Data: DRISEE
topic Research Article