Bayesian prediction of bacterial growth temperature range based on genome sequences

BACKGROUND: The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomi...

Bibliographic Details
Main Authors: Jensen, Dan B, Vesth, Tammi C, Hallin, Peter F, Pedersen, Anders G, Ussery, David W
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521210/
https://www.ncbi.nlm.nih.gov/pubmed/23282160
http://dx.doi.org/10.1186/1471-2164-13-S7-S3
_version_ 1782252905763438592
author Jensen, Dan B
Vesth, Tammi C
Hallin, Peter F
Pedersen, Anders G
Ussery, David W
author_facet Jensen, Dan B
Vesth, Tammi C
Hallin, Peter F
Pedersen, Anders G
Ussery, David W
author_sort Jensen, Dan B
collection PubMed
description BACKGROUND: The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. RESULTS: This study found a total of 40 protein families useful for distinguishing between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using naïve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. CONCLUSIONS: This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the naïve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between thermophilic, mesophilic and psychrophilic adapted bacterial genomes.
format Online
Article
Text
id pubmed-3521210
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3521210 2012-12-14 Bayesian prediction of bacterial growth temperature range based on genome sequences Jensen, Dan B Vesth, Tammi C Hallin, Peter F Pedersen, Anders G Ussery, David W BMC Genomics Proceedings BACKGROUND: The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. RESULTS: This study found a total of 40 protein families useful for distinguishing between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using naïve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. CONCLUSIONS: This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the naïve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between thermophilic, mesophilic and psychrophilic adapted bacterial genomes. BioMed Central 2012-12-07 /pmc/articles/PMC3521210/ /pubmed/23282160 http://dx.doi.org/10.1186/1471-2164-13-S7-S3 Text en Copyright ©2012 Jensen et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Jensen, Dan B
Vesth, Tammi C
Hallin, Peter F
Pedersen, Anders G
Ussery, David W
Bayesian prediction of bacterial growth temperature range based on genome sequences
title Bayesian prediction of bacterial growth temperature range based on genome sequences
title_full Bayesian prediction of bacterial growth temperature range based on genome sequences
title_fullStr Bayesian prediction of bacterial growth temperature range based on genome sequences
title_full_unstemmed Bayesian prediction of bacterial growth temperature range based on genome sequences
title_short Bayesian prediction of bacterial growth temperature range based on genome sequences
title_sort bayesian prediction of bacterial growth temperature range based on genome sequences
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521210/
https://www.ncbi.nlm.nih.gov/pubmed/23282160
http://dx.doi.org/10.1186/1471-2164-13-S7-S3
work_keys_str_mv AT jensendanb bayesianpredictionofbacterialgrowthtemperaturerangebasedongenomesequences
AT vesthtammic bayesianpredictionofbacterialgrowthtemperaturerangebasedongenomesequences
AT hallinpeterf bayesianpredictionofbacterialgrowthtemperaturerangebasedongenomesequences
AT pedersenandersg bayesianpredictionofbacterialgrowthtemperaturerangebasedongenomesequences
AT usserydavidw bayesianpredictionofbacterialgrowthtemperaturerangebasedongenomesequences
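
The abstract in this record describes naïve Bayesian inference over three thermophilicity classes, scored with a Matthews correlation coefficient. The following is a minimal illustrative sketch of that general idea, not the authors' program. It assumes features are simple presence/absence flags (for example, whether a protein family occurs in a genome), uses Laplace smoothing, and computes the multiclass generalization of the Matthews correlation coefficient; the record does not say which variant the study used. All names (`train`, `predict`, `multiclass_mcc`, `TOY_GENOMES`) and the toy data are hypothetical.

```python
import math
from collections import Counter

CLASSES = ("thermophile", "mesophile", "psychrophile")


def train(samples, n_features, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of (set of present feature indices, class label).
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    class_counts = Counter(label for _, label in samples)
    present = {c: [0] * n_features for c in CLASSES}
    for feats, label in samples:
        for f in feats:
            present[label][f] += 1
    total = len(samples)
    model = {}
    for c in CLASSES:
        n_c = class_counts[c]
        log_prior = math.log((n_c + alpha) / (total + alpha * len(CLASSES)))
        # Smoothed probability that each feature is present in class c.
        probs = [(present[c][f] + alpha) / (n_c + 2 * alpha)
                 for f in range(n_features)]
        model[c] = (log_prior, probs)
    return model


def predict(model, feats):
    """Return the class with the highest log posterior for one genome."""
    best_class, best_score = None, -math.inf
    for c, (log_prior, probs) in model.items():
        score = log_prior
        for f, p in enumerate(probs):
            score += math.log(p if f in feats else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (0.0 if undefined)."""
    s = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    t_k, p_k = Counter(y_true), Counter(y_pred)
    labels = set(t_k) | set(p_k)
    num = correct * s - sum(t_k[k] * p_k[k] for k in labels)
    den = math.sqrt((s * s - sum(v * v for v in p_k.values()))
                    * (s * s - sum(v * v for v in t_k.values())))
    return num / den if den else 0.0


# Hypothetical toy data: four made-up presence/absence features per genome.
TOY_GENOMES = [
    ({0}, "thermophile"), ({0, 1}, "thermophile"),
    ({1}, "mesophile"), ({1, 2}, "mesophile"),
    ({3}, "psychrophile"), ({2, 3}, "psychrophile"),
]
model = train(TOY_GENOMES, n_features=4)
truth = [label for _, label in TOY_GENOMES]
predicted = [predict(model, feats) for feats, _ in TOY_GENOMES]
print(predicted, multiclass_mcc(truth, predicted))
```

Working in log space avoids numerical underflow when many features are multiplied together, which matters once the feature count grows beyond a toy example.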