
Exploring the Optimal Strategy to Predict Essential Genes in Microbes

Accurately predicting essential genes is important in many aspects of biology, medicine and bioengineering. In previous research, we have developed a machine learning based integrative algorithm to predict essential genes in bacterial species. This algorithm lends itself to two approaches for predic...


Bibliographic Details
Main authors: Deng, Jingyuan; Tan, Lirong; Lin, Xiaodong; Lu, Yao; Lu, Long J.
Format: Online Article Text
Language: English
Published: MDPI 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4030871/
https://www.ncbi.nlm.nih.gov/pubmed/24970124
http://dx.doi.org/10.3390/biom2010001
author Deng, Jingyuan
Tan, Lirong
Lin, Xiaodong
Lu, Yao
Lu, Long J.
collection PubMed
description Accurately predicting essential genes is important in many aspects of biology, medicine and bioengineering. In previous research, we developed a machine-learning-based integrative algorithm to predict essential genes in bacterial species. This algorithm lends itself to two approaches for predicting essential genes: learning the traits from known essential genes in the target organism, or transferring essential gene annotations from a closely related model organism. However, for an understudied microbe, each approach has its potential limitations. The first is constrained by the often small number of known essential genes. The second is limited by the availability of model organisms and by evolutionary distance. In this study, we aim to determine the optimal strategy for predicting essential genes by examining four microbes with well-characterized essential genes. Our results suggest that, unless the known essential genes are few, learning from the known essential genes in the target organism usually outperforms transferring essential gene annotations from a related model organism. In fact, the number of known essential genes required to make accurate predictions is surprisingly small. In prokaryotes, when the number of known essential genes is greater than 2% of total genes, this approach already comes close to its optimal performance. In eukaryotes, achieving the same best performance requires over 4% of total genes, reflecting the increased complexity of eukaryotic organisms. Combining the two approaches improved performance when the known essential genes are few. Our investigation thus provides key information on accurately predicting essential genes and will greatly facilitate annotations of microbial genomes.
format Online
Article
Text
id pubmed-4030871
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4030871 2014-06-24 Exploring the Optimal Strategy to Predict Essential Genes in Microbes Deng, Jingyuan Tan, Lirong Lin, Xiaodong Lu, Yao Lu, Long J. Biomolecules Article
MDPI 2011-12-26 /pmc/articles/PMC4030871/ /pubmed/24970124 http://dx.doi.org/10.3390/biom2010001 Text en © 2012 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title Exploring the Optimal Strategy to Predict Essential Genes in Microbes
topic Article