
Genome-wide discovery of missing genes in biological pathways of prokaryotes

ABSTRACT: BACKGROUND: Reconstruction of biological pathways is typically done through mapping well-characterized pathways of model organisms to a target genome, through orthologous gene mapping. A limitation of such pathway-mapping approaches is that the mapped pathway models are constrained by the...

Bibliographic Details
Main Authors: Chen, Yong, Mao, Fenglou, Li, Guojun, Xu, Ying
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044263/
https://www.ncbi.nlm.nih.gov/pubmed/21342538
http://dx.doi.org/10.1186/1471-2105-12-S1-S1
_version_ 1782198705695227904
author Chen, Yong
Mao, Fenglou
Li, Guojun
Xu, Ying
author_facet Chen, Yong
Mao, Fenglou
Li, Guojun
Xu, Ying
author_sort Chen, Yong
collection PubMed
description ABSTRACT: BACKGROUND: Reconstruction of biological pathways is typically done through mapping well-characterized pathways of model organisms to a target genome, through orthologous gene mapping. A limitation of such pathway-mapping approaches is that the mapped pathway models are constrained by the composition of the template pathways, e.g., some genes in a target pathway may not have corresponding genes in the template pathways, the so-called “missing gene” problem. METHODS: We present a novel pathway-expansion method for identifying additional genes that are possibly involved in a target pathway after pathway mapping, to fill holes caused by missing genes as well as to expand the mapped pathway model. The basic idea of the algorithm is to identify genes in the target genome whose homologous genes share common operons with homologs of any mapped pathway genes in some reference genome, and to add such genes to the target pathway if their functions are consistent with the cellular function of the target pathway. RESULTS: We have implemented this idea using a graph-theoretic approach and demonstrated the effectiveness of the algorithm on known pathways of E. coli in the KEGG database. On all KEGG pathways containing at least 5 genes, our method achieves an average of 60% positive predictive value (PPV) and the performance is increased with more seed genes added. Analysis shows that our method is highly robust. CONCLUSIONS: An effective method is presented to find missing genes in biological pathways of prokaryotes, which achieves high prediction reliability on E. coli at a genome level. Numerous missing genes are found to be related to known E. coli pathways, which can be further validated through biological experiments. Overall this method is robust and can be used for functional inference.
format Text
id pubmed-3044263
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3044263 2011-02-25 Genome-wide discovery of missing genes in biological pathways of prokaryotes Chen, Yong Mao, Fenglou Li, Guojun Xu, Ying BMC Bioinformatics Research ABSTRACT: BACKGROUND: Reconstruction of biological pathways is typically done through mapping well-characterized pathways of model organisms to a target genome, through orthologous gene mapping. A limitation of such pathway-mapping approaches is that the mapped pathway models are constrained by the composition of the template pathways, e.g., some genes in a target pathway may not have corresponding genes in the template pathways, the so-called “missing gene” problem. METHODS: We present a novel pathway-expansion method for identifying additional genes that are possibly involved in a target pathway after pathway mapping, to fill holes caused by missing genes as well as to expand the mapped pathway model. The basic idea of the algorithm is to identify genes in the target genome whose homologous genes share common operons with homologs of any mapped pathway genes in some reference genome, and to add such genes to the target pathway if their functions are consistent with the cellular function of the target pathway. RESULTS: We have implemented this idea using a graph-theoretic approach and demonstrated the effectiveness of the algorithm on known pathways of E. coli in the KEGG database. On all KEGG pathways containing at least 5 genes, our method achieves an average of 60% positive predictive value (PPV) and the performance is increased with more seed genes added. Analysis shows that our method is highly robust. CONCLUSIONS: An effective method is presented to find missing genes in biological pathways of prokaryotes, which achieves high prediction reliability on E. coli at a genome level. Numerous missing genes are found to be related to known E. coli pathways, which can be further validated through biological experiments.
Overall this method is robust and can be used for functional inference. BioMed Central 2011-02-15 /pmc/articles/PMC3044263/ /pubmed/21342538 http://dx.doi.org/10.1186/1471-2105-12-S1-S1 Text en Copyright ©2011 Chen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Chen, Yong
Mao, Fenglou
Li, Guojun
Xu, Ying
Genome-wide discovery of missing genes in biological pathways of prokaryotes
title Genome-wide discovery of missing genes in biological pathways of prokaryotes
title_full Genome-wide discovery of missing genes in biological pathways of prokaryotes
title_fullStr Genome-wide discovery of missing genes in biological pathways of prokaryotes
title_full_unstemmed Genome-wide discovery of missing genes in biological pathways of prokaryotes
title_short Genome-wide discovery of missing genes in biological pathways of prokaryotes
title_sort genome-wide discovery of missing genes in biological pathways of prokaryotes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044263/
https://www.ncbi.nlm.nih.gov/pubmed/21342538
http://dx.doi.org/10.1186/1471-2105-12-S1-S1
work_keys_str_mv AT chenyong genomewidediscoveryofmissinggenesinbiologicalpathwaysofprokaryotes
AT maofenglou genomewidediscoveryofmissinggenesinbiologicalpathwaysofprokaryotes
AT liguojun genomewidediscoveryofmissinggenesinbiologicalpathwaysofprokaryotes
AT xuying genomewidediscoveryofmissinggenesinbiologicalpathwaysofprokaryotes
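
The basic idea stated in the abstract (target genes whose homologs share an operon with homologs of mapped pathway genes in some reference genome become candidates) can be illustrated with a minimal Python sketch. This is not the authors' graph-theoretic implementation, and it omits the functional-consistency filter the abstract mentions. The function names `expand_pathway` and `ppv`, the `min_support` parameter, and the input data layout are all hypothetical, chosen only for illustration. PPV is computed by its standard definition, true positives divided by all predictions.

```python
from collections import defaultdict


def expand_pathway(seed_genes, homologs, reference_operons, min_support=1):
    """Illustrative operon-based pathway expansion (hypothetical interface).

    seed_genes: set of target-genome genes already mapped to the pathway.
    homologs: dict mapping each target gene to a set of
        (reference_genome, reference_gene) pairs.
    reference_operons: dict mapping a reference genome to a list of operons,
        each operon being a set of reference gene names.
    Returns a dict: candidate target gene -> number of distinct reference
    operons in which one of its homologs co-occurs with a seed homolog.
    """
    # Index every reference gene by the operon that contains it.
    operon_of = {}
    for genome, operons in reference_operons.items():
        for index, operon in enumerate(operons):
            for gene in operon:
                operon_of[(genome, gene)] = (genome, index)

    # Collect the reference operons that contain a homolog of any seed gene.
    seed_operons = set()
    for seed in seed_genes:
        for hit in homologs.get(seed, ()):
            if hit in operon_of:
                seed_operons.add(operon_of[hit])

    # A non-seed target gene is a candidate if one of its homologs
    # lies in one of those operons.
    support = defaultdict(set)
    for gene, hits in homologs.items():
        if gene in seed_genes:
            continue
        for hit in hits:
            operon = operon_of.get(hit)
            if operon in seed_operons:
                support[gene].add(operon)

    return {g: len(ops) for g, ops in support.items() if len(ops) >= min_support}


def ppv(predicted, known):
    """Positive predictive value: true positives / all predictions."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(known)) / len(predicted)


# Toy data (invented for illustration, not taken from the paper).
toy_homologs = {
    "a": {("R1", "x1")},
    "b": {("R1", "x2")},
    "c": {("R1", "x9")},
}
toy_operons = {"R1": [{"x1", "x2"}, {"x9"}]}
candidates = expand_pathway({"a"}, toy_homologs, toy_operons)
```

In the toy data, gene `b` becomes a candidate because its homolog `x2` shares an operon with `x1`, the homolog of seed gene `a`, while `c` does not. Raising `min_support` would require evidence from more than one reference operon before a gene is proposed.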