
Analysis of Metagenomic Data Containing High Biodiversity Levels

In this paper we have addressed the problem of analysing Next Generation Sequencing samples with an expected large biodiversity content. We analysed several well-known 16S rRNA datasets from experimental samples, including both large and short sequences, in numbers of tens of thousands, in addition...


Bibliographic Details
Main authors: Valverde, José R., Mellado, Rafael P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3591453/
https://www.ncbi.nlm.nih.gov/pubmed/23505458
http://dx.doi.org/10.1371/journal.pone.0058118
author Valverde, José R.
Mellado, Rafael P.
collection PubMed
description In this paper we have addressed the problem of analysing Next Generation Sequencing samples with an expected large biodiversity content. We analysed several well-known 16S rRNA datasets from experimental samples, including both large and short sequences, in numbers of tens of thousands, in addition to carefully crafted synthetic datasets containing more than 7000 OTUs. From this data analysis, several patterns were identified and used to develop new guidelines for experimentation in conditions of high biodiversity. We analysed the suitability of different clustering packages for this type of situation, the problem of even sampling, and the relative effectiveness of the Chao1 and ACE estimators, as well as their effect on sampling size for a variety of population distributions. As regards practical analysis procedures, we advocated an approach that retains as much high-quality experimental data as possible. By carefully applying selection rules combining the taxonomic assignment with clustering strategies, we derived a set of recommendations for ultra-sequencing data analysis at high biodiversity levels.
format Online
Article
Text
id pubmed-3591453
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3591453 2013-03-15 Analysis of Metagenomic Data Containing High Biodiversity Levels Valverde, José R. Mellado, Rafael P. PLoS One Research Article Public Library of Science 2013-03-07 /pmc/articles/PMC3591453/ /pubmed/23505458 http://dx.doi.org/10.1371/journal.pone.0058118 Text en © 2013 Valverde, Mellado http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Analysis of Metagenomic Data Containing High Biodiversity Levels
topic Research Article