Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models

BACKGROUND: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications...

Bibliographic Details
Main authors: Waack, Stephan, Keller, Oliver, Asper, Roman, Brodag, Thomas, Damm, Carsten, Fricke, Wolfgang Florian, Surovcik, Katharina, Meinicke, Peter, Merkl, Rainer
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1489950/
https://www.ncbi.nlm.nih.gov/pubmed/16542435
http://dx.doi.org/10.1186/1471-2105-7-142
author Waack, Stephan
Keller, Oliver
Asper, Roman
Brodag, Thomas
Damm, Carsten
Fricke, Wolfgang Florian
Surovcik, Katharina
Meinicke, Peter
Merkl, Rainer
collection PubMed
description BACKGROUND: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications or mutations. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs) or, more specifically, pathogenicity or symbiotic islands. RESULTS: We have implemented the program SIGI-HMM, which predicts GIs and the putative donor of each individual alien gene. It is based on the analysis of the codon usage (CU) of each individual gene of a genome under study. The CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors and to mask putatively highly expressed genes. Thus, we determine the states and emission probabilities of an inhomogeneous hidden Markov model working at the gene level. For the transition probabilities, we draw upon classical test theory with the intention of integrating a sensitivity controller in a consistent manner. SIGI-HMM was written in Java and is publicly available. It accepts as input any file created according to the EMBL format. It generates output in the common GFF format, readable by genome browsers. Benchmark tests showed that the output of SIGI-HMM is in agreement with known findings. Its predictions were consistent both with annotated GIs and with predictions generated by different methods. CONCLUSION: SIGI-HMM is a sensitive tool for the identification of GIs in microbial genomes. It allows users to analyze genomes interactively in detail and to generate or test hypotheses about the origin of acquired genes.
format Text
id pubmed-1489950
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1489950 2006-07-10 Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models Waack, Stephan Keller, Oliver Asper, Roman Brodag, Thomas Damm, Carsten Fricke, Wolfgang Florian Surovcik, Katharina Meinicke, Peter Merkl, Rainer BMC Bioinformatics Software BioMed Central 2006-03-16 /pmc/articles/PMC1489950/ /pubmed/16542435 http://dx.doi.org/10.1186/1471-2105-7-142 Text en Copyright © 2006 Waack et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1489950/
https://www.ncbi.nlm.nih.gov/pubmed/16542435
http://dx.doi.org/10.1186/1471-2105-7-142
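
The description above outlines an approach in which the codon usage of each gene is compared against codon usage tables and a hidden Markov model then labels genes along the genome. The following Python sketch illustrates that general idea only; it is not the SIGI-HMM implementation. It assumes a plain two-state homogeneous HMM with a fixed self-transition probability (`stay`), whereas the record describes an inhomogeneous model with multiple tests and a sensitivity controller. The function names, the toy codon usage tables and the toy gene sequences are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def codon_counts(cds):
    """Count codons in a coding sequence; trailing bases that do not fill a codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def log_likelihood(counts, table, floor=1e-6):
    """Log-probability of the observed codons under one codon usage table.

    Codons missing from the table get a small floor probability (an assumption
    of this sketch) so that the logarithm is always defined.
    """
    return sum(n * math.log(table.get(codon, floor)) for codon, n in counts.items())


def viterbi(genes, tables, stay=0.9):
    """Return the most probable state label for each gene.

    States are the keys of `tables` (at least two are required). Each state
    keeps itself with probability `stay` and moves to any other state with
    equal probability; start probabilities are uniform.
    """
    states = list(tables)
    switch = (1.0 - stay) / (len(states) - 1)

    def trans(a, b):
        return math.log(stay if a == b else switch)

    emissions = [
        {s: log_likelihood(codon_counts(g), tables[s]) for s in states}
        for g in genes
    ]
    score = {s: math.log(1.0 / len(states)) + emissions[0][s] for s in states}
    back = []
    for em in emissions[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + trans(p, s))
            score[s] = prev[best] + trans(best, s) + em[s]
            pointers[s] = best
        back.append(pointers)

    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy codon usage tables (hypothetical values, restricted to four codons).
native = {"GCT": 0.45, "AAA": 0.45, "GCG": 0.05, "AAG": 0.05}
alien = {"GCT": 0.05, "AAA": 0.05, "GCG": 0.45, "AAG": 0.45}

# Five toy "genes"; the third and fourth use the codons favoured by `alien`.
genes = ["GCTAAAGCTAAA", "GCTAAAGCTAAA", "GCGAAGGCGAAG", "GCGAAGGCGAAG", "GCTAAAGCTAAA"]

path = viterbi(genes, {"native": native, "alien": alien})
print(path)
```

In this toy setting a run of consecutive genes labelled `alien` plays the role of a predicted genomic island. Raising `stay` makes the model more reluctant to switch states, which loosely mimics a sensitivity setting, although the record does not specify how SIGI-HMM derives its transition probabilities beyond their basis in classical test theory.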