
A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes



Bibliographic Details
Main Authors: Ye, Yuzhen; Doak, Thomas G.
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2714467/
https://www.ncbi.nlm.nih.gov/pubmed/19680427
http://dx.doi.org/10.1371/journal.pcbi.1000465
author Ye, Yuzhen
Doak, Thomas G.
collection PubMed
description A common biological pathway reconstruction approach—as implemented by many automatic biological pathway services (such as the KAAS and RAST servers) and the functional annotation of metagenomic sequences—starts with the identification of protein functions or families (e.g., KO families for the KEGG database and the FIG families for the SEED database) in the query sequences, followed by a direct mapping of the identified protein families onto pathways. Given a predicted patchwork of individual biochemical steps, some metric must be applied in deciding what pathways actually exist in the genome or metagenome represented by the sequences. Commonly, and straightforwardly, a complete biological pathway can be identified in a dataset if at least one of the steps associated with the pathway is found. We report, however, that this naïve mapping approach leads to an inflated estimate of biological pathways, and thus overestimates the functional diversity of the sample from which the DNA sequences are derived. We developed a parsimony approach, called MinPath (Minimal set of Pathways), for biological pathway reconstructions using protein family predictions, which yields a more conservative, yet more faithful, estimation of the biological pathways for a query dataset. MinPath identified far fewer pathways for the genomes collected in the KEGG database—as compared to the naïve mapping approach—eliminating some obviously spurious pathway annotations. Results from applying MinPath to several metagenomes indicate that the common methods used for metagenome annotation may significantly overestimate the biological pathways encoded by microbial communities.
format Text
id pubmed-2714467
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2714467 2009-08-14 A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes Ye, Yuzhen Doak, Thomas G. PLoS Comput Biol Research Article
Public Library of Science 2009-08-14 /pmc/articles/PMC2714467/ /pubmed/19680427 http://dx.doi.org/10.1371/journal.pcbi.1000465 Text en Ye, Doak. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes
topic Research Article
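
The description above contrasts naïve mapping (a pathway is reported if at least one of its steps is found) with a parsimony approach that reports a minimal set of pathways. The following is a minimal sketch of that contrast on toy data, not the MinPath implementation: the pathway names, family names, and function names are hypothetical, and the brute-force search is an assumption chosen for readability, since the description does not state how the minimal set is computed.

```python
from itertools import combinations


def naive_pathways(pathway_families, observed):
    """Naive mapping: report every pathway with at least one observed family."""
    return {p for p, fams in pathway_families.items() if fams & observed}


def minimal_pathways(pathway_families, observed):
    """Parsimony: a smallest set of pathways that explains all observed families.

    Brute force over subsets (illustrative assumption; only practical for tiny inputs).
    """
    candidates = sorted(naive_pathways(pathway_families, observed))
    # Families that map to no pathway cannot be explained, so they are ignored.
    coverable = set().union(*(pathway_families[p] for p in candidates)) & observed
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            covered = set()
            for p in combo:
                covered |= pathway_families[p]
            if coverable <= covered:
                return set(combo)
    return set(candidates)


# Hypothetical toy data: three pathways sharing some protein families.
toy_pathways = {
    "P1": {"F1", "F2", "F3"},
    "P2": {"F3", "F4"},
    "P3": {"F2", "F5"},
}
toy_observed = {"F1", "F2", "F3"}

naive_result = naive_pathways(toy_pathways, toy_observed)
minimal_result = minimal_pathways(toy_pathways, toy_observed)
print(sorted(naive_result), sorted(minimal_result))
```

In this toy case the naïve mapping reports all three pathways because P2 and P3 each share one family with P1, while the parsimony view needs only P1 to account for every observed family.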