Filtration and Normalization of Sequencing Read Data in Whole-Metagenome Shotgun Samples
Ever-increasing affordability of next-generation sequencing makes whole-metagenome sequencing an attractive alternative to traditional 16S rDNA, RFLP, or culturing approaches for the analysis of microbiome samples. The advantage of whole-metagenome sequencing is that it allows direct inference of th...
Main authors: | Chouvarine, Philippe; Wiehlmann, Lutz; Moran Losada, Patricia; DeLuca, David S.; Tümmler, Burkhard |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5070866/ https://www.ncbi.nlm.nih.gov/pubmed/27760173 http://dx.doi.org/10.1371/journal.pone.0165015 |
_version_ | 1782461213452533760 |
---|---|
author | Chouvarine, Philippe; Wiehlmann, Lutz; Moran Losada, Patricia; DeLuca, David S.; Tümmler, Burkhard |
author_facet | Chouvarine, Philippe; Wiehlmann, Lutz; Moran Losada, Patricia; DeLuca, David S.; Tümmler, Burkhard |
author_sort | Chouvarine, Philippe |
collection | PubMed |
description | Ever-increasing affordability of next-generation sequencing makes whole-metagenome sequencing an attractive alternative to traditional 16S rDNA, RFLP, or culturing approaches for the analysis of microbiome samples. The advantage of whole-metagenome sequencing is that it allows direct inference of the metabolic capacity and physiological features of the studied metagenome without reliance on the knowledge of genotypes and phenotypes of the members of the bacterial community. It also makes it possible to overcome problems of 16S rDNA sequencing, such as unknown copy number of the 16S gene and lack of sufficient sequence similarity of the “universal” 16S primers to some of the target 16S genes. On the other hand, next-generation sequencing suffers from biases resulting in non-uniform coverage of the sequenced genomes. To overcome this difficulty, we present a model of GC-bias in sequencing metagenomic samples as well as filtration and normalization techniques necessary for accurate quantification of microbial organisms. While there has been substantial research in normalization and filtration of read-count data in such techniques as RNA-seq or ChIP-seq, to our knowledge, this has not been the case for the field of whole-metagenome shotgun sequencing. The presented methods assume that complete genome references are available for most microorganisms of interest present in metagenomic samples. This is often a valid assumption in such fields as medical diagnostics of patient microbiota. Testing the model on two validation datasets showed a four-fold reduction in root-mean-square error compared to non-normalized data in both cases. The presented methods can be applied to any pipeline for whole-metagenome sequencing analysis relying on complete microbial genome references. We demonstrate that such pre-processing reduces the number of false positive hits and increases accuracy of abundance estimates. |
format | Online Article Text |
id | pubmed-5070866 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5070866 2016-10-27 Filtration and Normalization of Sequencing Read Data in Whole-Metagenome Shotgun Samples. Chouvarine, Philippe; Wiehlmann, Lutz; Moran Losada, Patricia; DeLuca, David S.; Tümmler, Burkhard. PLoS One. Research Article. Public Library of Science 2016-10-19 /pmc/articles/PMC5070866/ /pubmed/27760173 http://dx.doi.org/10.1371/journal.pone.0165015 Text en © 2016 Chouvarine et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Chouvarine, Philippe Wiehlmann, Lutz Moran Losada, Patricia DeLuca, David S. Tümmler, Burkhard Filtration and Normalization of Sequencing Read Data in Whole-Metagenome Shotgun Samples |
title | Filtration and Normalization of Sequencing Read Data in Whole-Metagenome Shotgun Samples |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5070866/ https://www.ncbi.nlm.nih.gov/pubmed/27760173 http://dx.doi.org/10.1371/journal.pone.0165015 |