
The CanOE Strategy: Integrating Genomic and Metabolic Contexts across Multiple Prokaryote Genomes to Find Candidate Genes for Orphan Enzymes

Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activitie...


Bibliographic Details
Main authors: Smith, Adam Alexander Thil; Belda, Eugeni; Viari, Alain; Medigue, Claudine; Vallenet, David
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3364942/
https://www.ncbi.nlm.nih.gov/pubmed/22693442
http://dx.doi.org/10.1371/journal.pcbi.1002540
author Smith, Adam Alexander Thil
Belda, Eugeni
Viari, Alain
Medigue, Claudine
Vallenet, David
collection PubMed
description Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none is both readily accessible and able to integrate results across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates “genomic metabolons”, i.e. groups of co-localized genes coding for proteins that catalyze reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly useful to bioanalysts for visualizing relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank the members of the gene families proposed for each metabolic reaction. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
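The four steps in the description can be illustrated with a toy sketch. This is not the CanOE implementation: the `Gene` record, the `max_gap` co-localization rule, and the use of a plain genome count as the only score are assumptions made for illustration (the abstract mentions several scores without defining them). The sketch also simplifies step 1 by treating any run of neighbouring genes as a metabolon, without checking that its annotated reactions are linked to one another.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    position: int                       # rank of the gene along the chromosome
    family: str                         # hypothetical gene-family label
    reactions: frozenset = frozenset()  # empty for un-annotated genes


def colocalized_groups(genes, max_gap=2):
    """Split genes into runs whose neighbouring positions differ by <= max_gap."""
    groups, current = [], []
    for gene in sorted(genes, key=lambda g: g.position):
        if current and gene.position - current[-1].position > max_gap:
            groups.append(current)
            current = []
        current.append(gene)
    if current:
        groups.append(current)
    return groups


def candidate_associations(genes, reaction_metabolites, max_gap=2):
    """Steps 1-2 (toy): pair un-annotated genes with gene-less reactions
    that share a metabolite with a reaction annotated in the same group."""
    annotated = {r for g in genes for r in g.reactions}
    gene_less = set(reaction_metabolites) - annotated
    pairs = set()
    for group in colocalized_groups(genes, max_gap):
        group_metabolites = set()
        for g in group:
            for r in g.reactions:
                group_metabolites |= reaction_metabolites.get(r, set())
        for gene in group:
            if gene.reactions:
                continue  # already annotated, not a candidate
            for r in gene_less:
                if reaction_metabolites[r] & group_metabolites:
                    pairs.add((gene, r))
    return pairs


def rank_families(genomes, reaction_metabolites, max_gap=2):
    """Steps 3-4 (toy): count supporting genomes per (family, reaction), then rank."""
    support = defaultdict(set)
    for genome_name, genes in genomes.items():
        for gene, rxn in candidate_associations(genes, reaction_metabolites, max_gap):
            support[(gene.family, rxn)].add(genome_name)
    ranked = [(fam, rxn, len(names)) for (fam, rxn), names in support.items()]
    return sorted(ranked, key=lambda t: (-t[2], t[0], t[1]))
```

Called with a dictionary mapping genome names to lists of `Gene` records and a dictionary mapping reaction identifiers to sets of metabolites, `rank_families` returns `(family, reaction, supporting_genomes)` tuples, best supported first. A gene family that sits next to an annotated neighbour in several genomes therefore ranks above one seen in a single genome, which is the intuition behind integrating over multiple genomes.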
format Online
Article
Text
id pubmed-3364942
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3364942 2012-06-12 PLoS Comput Biol Research Article Public Library of Science 2012-05-31 /pmc/articles/PMC3364942/ /pubmed/22693442 http://dx.doi.org/10.1371/journal.pcbi.1002540 Text en Smith et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title The CanOE Strategy: Integrating Genomic and Metabolic Contexts across Multiple Prokaryote Genomes to Find Candidate Genes for Orphan Enzymes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3364942/
https://www.ncbi.nlm.nih.gov/pubmed/22693442
http://dx.doi.org/10.1371/journal.pcbi.1002540