
Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads

Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results, or on variants of that approach, and often produce biased and/or unstable estimates....


Bibliographic Details
Main authors: Xia, Li C., Cram, Jacob A., Chen, Ting, Fuhrman, Jed A., Sun, Fengzhu
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3232206/
https://www.ncbi.nlm.nih.gov/pubmed/22162995
http://dx.doi.org/10.1371/journal.pone.0027992
author Xia, Li C.
Cram, Jacob A.
Chen, Ting
Fuhrman, Jed A.
Sun, Fengzhu
collection PubMed
description Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results, or on variants of that approach, and often produce biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (at least 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based) even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.
format Online
Article
Text
id pubmed-3232206
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3232206 2011-12-09 Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads Xia, Li C. Cram, Jacob A. Chen, Ting Fuhrman, Jed A. Sun, Fengzhu PLoS One Research Article Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results, or on variants of that approach, and often produce biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (at least 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based) even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes. Public Library of Science 2011-12-06 /pmc/articles/PMC3232206/ /pubmed/22162995 http://dx.doi.org/10.1371/journal.pone.0027992 Text en Xia et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads
topic Research Article
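The abstract in this record describes estimating genome relative abundance by maximum likelihood under a mixture model that accounts for ambiguous read assignments and genome size bias. The sketch below illustrates that general idea with a plain expectation-maximization loop; it is not the GRAMMy implementation, and the function name, the `read_probs` input layout, and the toy numbers are all assumptions made for illustration.

```python
def estimate_relative_abundance(read_probs, genome_lengths, n_iter=200, tol=1e-10):
    """Illustrative EM for a finite mixture (hypothetical helper, not GRAMMy's API).

    read_probs[i][j] is an assumed probability of observing read i given
    genome j (for example derived from alignment scores); 0.0 means no hit.
    Returns (mixing proportions of reads, length-normalized abundances).
    """
    n_genomes = len(genome_lengths)
    pi = [1.0 / n_genomes] * n_genomes  # start from uniform mixing proportions
    for _ in range(n_iter):
        # E-step: split each read across genomes by posterior responsibility.
        counts = [0.0] * n_genomes
        for probs in read_probs:
            weighted = [p * w for p, w in zip(probs, pi)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read matches no reference genome; ignore it
            for j, w in enumerate(weighted):
                counts[j] += w / total
        assigned = sum(counts)
        if assigned == 0.0:
            raise ValueError("no read could be assigned to any genome")
        # M-step: new mixing proportions are the normalized expected counts.
        new_pi = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(new_pi, pi))
        pi = new_pi
        if delta < tol:
            break
    # Correct for genome size: longer genomes yield more reads per cell.
    per_base = [p / length for p, length in zip(pi, genome_lengths)]
    norm = sum(per_base)
    return pi, [x / norm for x in per_base]


# Toy data (made up): genome B is twice as long as genome A.
reads = [[1.0, 0.0]] * 30 + [[0.0, 1.0]] * 60 + [[0.5, 0.5]] * 10
pi, abundance = estimate_relative_abundance(reads, [2_000_000, 4_000_000])
print([round(x, 3) for x in pi])         # [0.333, 0.667]
print([round(x, 3) for x in abundance])  # [0.5, 0.5]
```

In this toy case genome B attracts two thirds of the reads only because it is twice as long; after dividing by genome length the two genomes come out equally abundant, which is the kind of size bias the abstract says direct read counting fails to correct.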