A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes
A common biological pathway reconstruction approach—as implemented by many automatic biological pathway services (such as the KAAS and RAST servers) and the functional annotation of metagenomic sequences—starts with the identification of protein functions or families (e.g., KO families for the KEGG...
Main Authors: | Ye, Yuzhen; Doak, Thomas G. |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science 2009 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2714467/ https://www.ncbi.nlm.nih.gov/pubmed/19680427 http://dx.doi.org/10.1371/journal.pcbi.1000465 |
_version_ | 1782169673897345024 |
---|---|
author | Ye, Yuzhen Doak, Thomas G. |
author_facet | Ye, Yuzhen Doak, Thomas G. |
author_sort | Ye, Yuzhen |
collection | PubMed |
description | A common biological pathway reconstruction approach—as implemented by many automatic biological pathway services (such as the KAAS and RAST servers) and the functional annotation of metagenomic sequences—starts with the identification of protein functions or families (e.g., KO families for the KEGG database and the FIG families for the SEED database) in the query sequences, followed by a direct mapping of the identified protein families onto pathways. Given a predicted patchwork of individual biochemical steps, some metric must be applied in deciding what pathways actually exist in the genome or metagenome represented by the sequences. Commonly, and straightforwardly, a complete biological pathway can be identified in a dataset if at least one of the steps associated with the pathway is found. We report, however, that this naïve mapping approach leads to an inflated estimate of biological pathways, and thus overestimates the functional diversity of the sample from which the DNA sequences are derived. We developed a parsimony approach, called MinPath (Minimal set of Pathways), for biological pathway reconstructions using protein family predictions, which yields a more conservative, yet more faithful, estimation of the biological pathways for a query dataset. MinPath identified far fewer pathways for the genomes collected in the KEGG database—as compared to the naïve mapping approach—eliminating some obviously spurious pathway annotations. Results from applying MinPath to several metagenomes indicate that the common methods used for metagenome annotation may significantly overestimate the biological pathways encoded by microbial communities. |
format | Text |
id | pubmed-2714467 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-2714467 2009-08-14 A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes Ye, Yuzhen Doak, Thomas G. PLoS Comput Biol Research Article Public Library of Science 2009-08-14 /pmc/articles/PMC2714467/ /pubmed/19680427 http://dx.doi.org/10.1371/journal.pcbi.1000465 Text en Ye, Doak. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Ye, Yuzhen Doak, Thomas G. A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes |
title | A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes |
title_full | A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes |
title_fullStr | A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes |
title_full_unstemmed | A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes |
title_short | A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes |
title_sort | parsimony approach to biological pathway reconstruction/inference for genomes and metagenomes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2714467/ https://www.ncbi.nlm.nih.gov/pubmed/19680427 http://dx.doi.org/10.1371/journal.pcbi.1000465 |
work_keys_str_mv | AT yeyuzhen aparsimonyapproachtobiologicalpathwayreconstructioninferenceforgenomesandmetagenomes AT doakthomasg aparsimonyapproachtobiologicalpathwayreconstructioninferenceforgenomesandmetagenomes AT yeyuzhen parsimonyapproachtobiologicalpathwayreconstructioninferenceforgenomesandmetagenomes AT doakthomasg parsimonyapproachtobiologicalpathwayreconstructioninferenceforgenomesandmetagenomes |
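
The description in this record contrasts naive mapping (a pathway is reported as soon as any one of its protein families is found) with a parsimony approach that looks for a minimal set of pathways explaining the observed families. The sketch below illustrates that contrast on invented toy data. It is not the MinPath implementation: the pathway names, family identifiers, and function names are hypothetical, and a greedy set-cover heuristic stands in for whatever optimization the published tool performs.

```python
from typing import Dict, List, Set


def naive_pathways(families: Set[str], pathway_map: Dict[str, Set[str]]) -> Set[str]:
    """Naive mapping: report every pathway with at least one observed family."""
    return {p for p, members in pathway_map.items() if members & families}


def parsimonious_pathways(families: Set[str], pathway_map: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover (an illustrative stand-in for a parsimony method).

    Repeatedly picks the pathway that explains the most still-unexplained
    families until every mappable family is explained.
    """
    # Only families that belong to at least one pathway can be explained.
    uncovered = {f for f in families if any(f in m for m in pathway_map.values())}
    chosen: List[str] = []
    while uncovered:
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(pathway_map), key=lambda p: len(pathway_map[p] & uncovered))
        chosen.append(best)
        uncovered -= pathway_map[best]
    return chosen


# Hypothetical toy data: three pathways sharing some protein families.
pathway_map = {
    "pathway_A": {"F1", "F2", "F3"},
    "pathway_B": {"F3", "F4"},
    "pathway_C": {"F2"},
}
observed = {"F1", "F2", "F3"}

naive = naive_pathways(observed, pathway_map)
minimal = parsimonious_pathways(observed, pathway_map)
print(sorted(naive))  # all three pathways are reported
print(minimal)        # a single pathway explains every observed family
```

On this toy input the naive rule reports three pathways, while one pathway is enough to account for all observed families, which mirrors the inflation the description attributes to naive mapping. A greedy cover is not guaranteed to be minimal in general; it is used here only because it is short and easy to follow.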