Likelihood-Based Gene Annotations for Gap Filling and Quality Assessment in Genome-Scale Metabolic Models

Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame....

Bibliographic Details
Main authors: Benedict, Matthew N., Mundy, Michael B., Henry, Christopher S., Chia, Nicholas, Price, Nathan D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199484/
https://www.ncbi.nlm.nih.gov/pubmed/25329157
http://dx.doi.org/10.1371/journal.pcbi.1003882
author Benedict, Matthew N.
Mundy, Michael B.
Henry, Christopher S.
Chia, Nicholas
Price, Nathan D.
collection PubMed
description Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.
format Online
Article
Text
id pubmed-4199484
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4199484 2014-10-21
PLoS Comput Biol
Research Article
Public Library of Science 2014-10-16
/pmc/articles/PMC4199484/ /pubmed/25329157 http://dx.doi.org/10.1371/journal.pcbi.1003882
Text en
https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title Likelihood-Based Gene Annotations for Gap Filling and Quality Assessment in Genome-Scale Metabolic Models
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199484/
https://www.ncbi.nlm.nih.gov/pubmed/25329157
http://dx.doi.org/10.1371/journal.pcbi.1003882